Breaking into the World Wide Web

Escaping Internet Restrictions

 

 

Introduction

A common subject on the internet is how to escape internet censorship. This isn't just a problem for oppressed dissidents in China; many people are subject to internet restrictions that block access to at least part of the web. People affected include the following groups:

  • People living (or working) in countries including:

    • Pakistan
    • Kuwait & other Gulf nations
    • China
    • Burma
    • Probably a lot more
  • People behind corporate firewalls at work

  • Students at high schools and universities

  • Some internet cafes

  • Many libraries now have internet filtering in place

Needless to say, there are various approaches to hacking your way out of these kinds of restrictions, and lots of people are busy trying.

In my current job (well, the one I spend four days a week at) we have very strict internet restrictions. These have varied over the last few years, and so I've had a fair bit of experience with different techniques for breaking out of our corporate firewall.

I even wrote my own CGI Proxy called approx, which works fine under certain types of restriction. Judging by the number of people who find my website whilst searching for "Start Using CGI Proxy" [1], there are a lot of people trying to get unrestricted access to the net.

This is a tutorial, or HOWTO, on various techniques you could use to do this. It is not a comprehensive technical reference, but it offers several simple things you can try. Good luck!

Types of Restriction

There are a few, slightly different, types of internet blocking in use. This is a brief description of the different categories.

  • Blacklists

    Your ISP (or company server) maintains a list of domains or URLs to block. These lists can be very large. Most home-based filtering programs, like Netnanny and Cyberpatrol [2], work with a blacklist [3].

    Squid, a common cache server used by companies and organisations for internet filtering, has various blacklists that it can work with.

  • Content filtering

    Some restrictions work by checking the content as it is fetched. Certain words or phrases can trigger a block, as can images with too high a degree of flesh tone in them.

    Some of the proxy techniques listed here won't help you get past this, as the content may still be blocked even if the URL isn't. Using an encrypted connection (e.g. via https or ssh) will prevent that.

  • Whitelists

    This is the most severe form of internet restriction. Access is only allowed to a small set of sites on the whitelist.

    This can be very difficult to get past, but there are still techniques for the desperate.
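
The three kinds of restriction described above can be pictured with a small sketch. This is not how any particular filtering product works; the function name, its parameters and the example domains are all made up for illustration.

```python
from urllib.parse import urlsplit


def is_allowed(url, *, blacklist=(), whitelist=None, banned_words=(), content=""):
    """Return True if a request would pass a simple, hypothetical filter."""
    host = (urlsplit(url).hostname or "").lower()

    # Whitelist: anything that is not explicitly listed is refused.
    if whitelist is not None and host not in whitelist:
        return False

    # Blacklist: refuse any domain that appears on the list.
    if host in blacklist:
        return False

    # Content filtering: look for trigger words in the fetched text.
    text = content.lower()
    return not any(word.lower() in text for word in banned_words)
```

Note that the content check runs even when the domain itself is not listed, which is why a plain proxy does not always help against content filtering.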

For more information on different types of censoring and censoring tools, you may like to visit Peacefire.org.

Following are a few of the methods I've used to bypass the blocking in a variety of different situations.

Proxy Servers

All the solutions involve setting up or finding a 'proxy server'. This is a computer (or website) that you can reach, which will fetch the pages you can't reach.

Proxy servers basically come in two types: web based proxies and proper proxies.

Note

If you are non-technical and just want some simple ways of breaking through a censoring firewall I would go straight to Web Based Proxies or Other Techniques.

The Proper Proxies and SSH - Secure Shell techniques take a bit of setting up to get them to work.

Web Based Proxies

These are typically cgi proxies, where you visit a website and enter the URL you want into a form. The proxy then fetches the page for you, but modifies all links, image and script tags so that they are also fetched through the proxy.

This means that the filtering program thinks you are visiting one website, but you are able to view the contents of the site you really want. Unless you reach the proxy over an https connection, the pages may still be blocked by content filtering systems.
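
The link rewriting described above might look roughly like the following. It is only a sketch: the proxy address `http://proxy.example/fetch?url=` is hypothetical, and a real CGI proxy handles many more cases (forms, CSS, scripts, cookies) than this regular expression does.

```python
import re
from urllib.parse import quote, urljoin

PROXY = "http://proxy.example/fetch?url="  # hypothetical proxy address

# Matches href="..." and src="..." attributes (single or double quoted).
ATTR = re.compile(
    r'(?P<attr>\b(?:href|src)=)(?P<q>["\'])(?P<url>.*?)(?P=q)',
    re.IGNORECASE,
)


def rewrite_links(html, page_url, proxy=PROXY):
    """Point every href/src in html back through the proxy."""
    def replace(match):
        # Resolve relative links against the page they came from.
        absolute = urljoin(page_url, match.group("url"))
        wrapped = proxy + quote(absolute, safe="")
        return match.group("attr") + match.group("q") + wrapped + match.group("q")

    return ATTR.sub(replace, html)
```

For example, `rewrite_links('<a href="/about.html">About</a>', "http://www.example.org/index.html")` turns the link into `http://proxy.example/fetch?url=http%3A%2F%2Fwww.example.org%2Fabout.html`. The original address is still readable inside the rewritten one.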

There are lots of these available. The free ones tend to come and go [4], but there are some that have been around for years. Most of these tend to offer premium services as well, that they charge for. You can find good lists of them at Freeproxy.ru and Proxy.org.

The main disadvantages of these types of proxy are:

  • The URL you visit will be 'munged', but the real URL you are visiting may still be visible as part of the full URL. (And so be detectable in the server logs.)
  • JavaScript usually breaks if it uses relative links that no longer apply.
  • The URL is no longer suitable for bookmarking and if you save the page it will have modified links/tags.

Approx attempts to get round some of these problems by working with a client side proxy program, approxClientproxy. You configure your browser [5] to use the client side proxy, and it modifies the URLs for you. This means that normal browser requests work fine: JavaScript works, you can bookmark URLs and the pages you fetch aren't modified. Approx [6] also has URL 'obfuscation', so that the real location you are visiting is not obvious in the server logs.

To use this, you will need to install it on your own server which has Python installed [7].
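
To give an idea of what URL 'obfuscation' can mean, here is a minimal sketch using base64. This is an assumption for illustration only; it is not necessarily the scheme Approx uses, and it only hides the address from a casual glance at the logs. It is not encryption.

```python
import base64


def obfuscate(url):
    """Encode a URL so that it is not readable at a glance."""
    return base64.urlsafe_b64encode(url.encode("utf-8")).decode("ascii")


def deobfuscate(token):
    """Recover the original URL from an obfuscated token."""
    return base64.urlsafe_b64decode(token.encode("ascii")).decode("utf-8")
```

The proxy would call `deobfuscate` on the token it receives before fetching the page. Anyone who recognises base64 can reverse it just as easily.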

Proper Proxies

Proper proxy servers are ones that forward your web request to another location. When you fetch a web page it goes through the proxy, so that the filtering server can't tell what location you are really visiting.

These proxies may also remove all trace of you from the request you make, allowing you to browse anonymously. This means that the server you visit can't tell who you are either. You can find lots of resources about this on the internet. For general information about proxy servers, visit the Wikipedia page.

There are a lot of these proxies to be found. Some of these are deliberately left open, and some are as a result of misconfigured servers that have accidentally left access open. You can get a list of these proxies from places like proxy4free and rosinstrument.

When I was able to use this kind of proxy I found a program called proxytools very useful. It keeps an updated list of available proxies and can switch between them as necessary. The author is very helpful at sorting out problems.

SSH - Secure Shell

There is always a vague risk that a proxy server might be a trap, set up to discover who is using it. The only way to be certain this isn't happening is to set up your own proxy. For this you will need access to a shell account somewhere on the internet that you can reach.

If you can do this, there is a slightly easier technique that will give you unrestricted freedom. Its advantage (which applies generally to proper proxies) is that it also lets you reach other internet services (such as IRC and peer to peer file sharing) that may be blocked.

If you have a shell account you will almost certainly use SSH to log in to it. The standard SSH client for Windows is called PuTTY.

You can configure this to use Dynamic Port Forwarding. This page briefly explains how to set it up.

This sets up your SSH connection as a SOCKS proxy. You will need to configure your browser to use it. If your application can't use a SOCKS proxy (most browsers and p2p clients can), then you can also set up port forwarding for individual services.
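
If you use the OpenSSH command line client rather than PuTTY, the equivalent of dynamic port forwarding is the `-D` option. The sketch below just builds that command line; the user name, host name and local port 1080 are placeholders, not values from this article.

```python
import shlex


def ssh_socks_command(user, host, local_port=1080, ssh_port=22):
    """Build an OpenSSH command line that opens a SOCKS proxy on local_port."""
    return [
        "ssh",
        "-N",                   # no remote command, just forward traffic
        "-D", str(local_port),  # dynamic (SOCKS) forwarding on this local port
        "-p", str(ssh_port),    # port the remote SSH server listens on
        f"{user}@{host}",
    ]


# Show the command as you would type it in a shell.
print(shlex.join(ssh_socks_command("me", "shell.example.org")))
```

You would then point your browser's SOCKS proxy setting at localhost and the local port you chose. The `ssh_port` argument is there for servers that listen somewhere other than port 22.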

The advantage of using SSH is that your http tunnel is protected with an industrial strength encryption algorithm.

Warning

There is a security hole with this method. Although all the internet traffic goes across your encrypted connection, DNS lookups will still be done locally.

This means that it is theoretically possible for your network administrator to work out which sites you are visiting.
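
Whether the lookup happens locally depends on what the client puts in its SOCKS request. The sketch below builds a SOCKS5 CONNECT request (as laid out in RFC 1928) both ways. It is an illustration of the protocol, not a description of what any particular browser does, and the function name is made up.

```python
import socket
import struct


def socks5_connect_request(host, port, remote_dns=True):
    """Build the bytes of a SOCKS5 CONNECT request for host:port."""
    header = b"\x05\x01\x00"  # version 5, CONNECT command, reserved byte
    if remote_dns:
        # Address type 3: send the name itself, so the far end resolves it.
        name = host.encode("idna")
        address = b"\x03" + bytes([len(name)]) + name
    else:
        # Address type 1: resolve locally first (this is the visible lookup).
        address = b"\x01" + socket.inet_aton(socket.gethostbyname(host))
    return header + address + struct.pack("!H", port)
```

With `remote_dns=True` the hostname travels inside the encrypted tunnel and no local DNS query is made for it. With `remote_dns=False` the name is looked up on your own network before the request is sent.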

Other Techniques

If none of the techniques we've already discussed work for you, there are still one or two tricks available. Some of these may work, even in the most restricted environments.

If you have any more tricks or tips, then please Contact Me and I will add them to this section.

Happy browsing, and don't get up to anything naughty.

SSH On a Non-Standard Port

If you can't reach an outside server (for example your network operates on a whitelist only system), it is possible that your system administrator will still allow you to fetch POP email from an outside domain.

This usually involves allowing traffic to the email domain on port 110. If you set up your SSH server to serve on port 110 (and disable any POP server you run) then you can use the SSH - Secure Shell trick for unrestricted browsing.

Another trick is to set up a fake website, which you may be able to blag your sysadmin into putting on the whitelist. You can then set up SSH to serve on port 443 (normally used for https). The drawback is that you can no longer serve secure pages from a webserver on the same IP address.

Note

It is tricky to configure Squid to filter http and https. I have seen networks configured to block most http traffic that allow unfiltered https. That means that any proxy or CGI proxy running over https may be within reach.

JAP

JAP is an interesting service that obscures your internet traffic. It involves installing a program on your machine.

It works fine, but when I used it, it had the disadvantage that all your traffic appears to go to a single (German) site.

My suspicious network administrator wondered why I was only visiting one site, and went to see what it was...

The Google Cache

You may know that in order to index the whole internet, Google keeps a copy of each webpage. This is called their cache.

From the search results page you can choose to view the cached version rather than the original. Often if pages are banned you can still view the cached page.

Some systems ban a lot of the web, but still allow you access to the Google domains.

In this case you can take advantage of a clever little hack [8] called the googleCacheServer.

This works as a little proxy server on your computer (you need to configure your browser to use localhost on port 9080 as a proxy). You will need Python installed and a Google license key to use it.
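
The same proxy setting can be used from a Python script instead of a browser. This is a minimal sketch with the standard library, assuming some proxy (such as the cache server) is already listening on localhost port 9080. The helper name is made up.

```python
import urllib.request


def opener_via_local_proxy(port=9080):
    """Return a urllib opener that sends http requests to a local proxy."""
    proxy = urllib.request.ProxyHandler({"http": f"http://localhost:{port}"})
    return urllib.request.build_opener(proxy)


opener = opener_via_local_proxy()
# opener.open("http://www.example.org/") would now go through localhost:9080.
```

Nothing is fetched until you call `opener.open(...)`, so building the opener works even when the proxy isn't running.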

Note

If you can't (or don't want to) install programs, then you can use a version of Python called Movable Python instead.

Whenever you browse to a page, the cache server will attempt to fetch it from the cache [9].

Unfortunately the cache is usually a few days out of date. It doesn't contain every page, and doesn't store JavaScript, images, CSS etc. Browsing through the cache server is like going back in time ten years.

You also can't do things like post messages, but it's still much better than nothing. It can also be useful if a website you want to visit is temporarily down.

Translating Proxies

You have probably seen online services that will translate text from one language to another. There are several of these, like Babelfish and Google Translate.

You may not have twigged that these can also be used as a CGI proxy.

They can translate web pages from one language to another. If you enter the URL into the form, and select translate from Chinese -> English, you will get back a copy of the page. Again it will be without images etc, but better than nothing.

Your system administrator may be sympathetic if you ask for one of these to be added to the list of allowed sites, especially if you have a foreign relative and say that you occasionally need to translate emails.

Browsing by Email

This is a firewall crack for the truly desperate. Even if you have no access to the world wide web, you can still browse the internet by email. Yep, seriously.

These services work by you sending an email to a certain address (a server somewhere), containing the web location (URL) you are after. The server then replies, usually after a bit of a delay, and emails the web page to you.

Because of the delay, and the hassle of putting together the email, this is not ideal. It can sometimes be a life saver though.

My favourite service is provided by a website called Pagegetter.com. Send an email to web@pagegetter.com, with the URL you want as the subject. For example, try this:

Voidspace via Pagegetter

Their service is the best because it includes the images in the return emails, and sets up the links so that you can browse just by clicking on them. Unfortunately they charge for unrestricted access, but allow you thirty free pages per month (per email address).
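
If you want to script the request rather than type it into a mail client, the message itself is easy to build. A minimal sketch with Python's standard library follows. The sender address and the example URL are placeholders, and actually sending the message needs a mail server you are allowed to use.

```python
from email.message import EmailMessage


def page_request(url, sender, service="web@pagegetter.com"):
    """Build an email asking the service to fetch url (the URL is the subject)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = service
    msg["Subject"] = url
    msg.set_content("")  # the service reads the subject, so the body is empty
    return msg


request = page_request("http://www.example.org/", "me@example.org")
```

The resulting message could be handed to `smtplib.SMTP(...).send_message(request)`, with the server details depending entirely on your own mail setup.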

You can find more information on browsing the web by email at expita and on the agora server help page.

Some of these services even allow downloads by email. I still haven't found a filesharing client that works by email though.


[1] The signature text of the James Marshall Perl CGIProxy, which is used a great deal.
[2] The first internet filter used at my workplace was Cyberpatrol. It was reasonably easy to break using a crack from the internet. Whether that still works I don't know.
[3] So somebody out there is paid to look for internet porn.
[4] They get heavy use, and are big bandwidth drains.
[5] See a page like this HOWTO on configuring your browser to use a proxy server.
[6] I haven't worked on approx for a while. You currently can't fetch https locations when using the client side proxy.
[7] Or pay me to set you one up.
[8] Written by me as it happens.
[9] You need access to the Google domain api.google.com for this to work.

Last edited Sun Oct 01 20:10:54 2006.
