Use the “hosts” file to block adverts and web tracking

Eliminate specific adverts from your browsing and hinder attempts that ill-mannered companies make to track your web browsing activity.

Many web pages include images or other content that is not drawn from the principal site being viewed but, instead, from some other, third-party, site. These third-party organizations can use cookies and the connection IP address to track a person's browsing habits across many different web sites. This technique might be considered undesirable for several reasons including a potential loss of personal privacy and the fact that the advertising banners use up bandwidth that you, the viewer, are possibly paying for. The technique described in this page can be used to block access to these sites by intercepting the references to them and returning some other information instead. The technique can be used on any Windows based or GNU/Linux based computer. It can probably be easily adapted for use with other operating systems as well.

Content revision history:
Article first written, early September 2004
Added information about Metronet, 2nd March 2005.

Introduction

This article explains how you can use the “hosts” file on your computer to block adverts and to hinder the attempts that rude corporations make to track your browsing activity.

The first thing you need to know, however, is that while it is possible to use the “hosts” file to block adverts, it is not the only method. The main advantage of the “hosts” file method is that it requires very little technical knowledge and it is simple enough for many computer users to do by themselves. If you only have a single computer to consider then this method might be adequate. Against the advantage of simplicity is the fact that this method is particularly difficult to keep up-to-date, especially if you have several computers and wish to block adverts for all of them.

There are at least four other methods that can be used to block advertising and to hinder or prevent web tracking. Naturally no method is perfect and each method has its own strengths and weaknesses. The four other methods that come to mind are:

  • Using specific advertising blocking software installed on your computer.
  • Using DNS to intercept and redirect attempts to contact certain domains. This method is discussed in another article on this web site: Using DNS to block adverts.
  • Using software on your own computer or within your own local network in conjunction with third party block-list sites.
  • Using a company called Metronet for your ADSL connection. This option is only available to people in the United Kingdom.

If you are managing a cluster of computers in an organization then you will most likely find that investing time to explore one of these other methods will give you more effective advert blocking than you could achieve merely by using the hosts file. Try performing an internet search using terms such as “ad block”, “junkbuster”, “blocking web adverts”.

Despite its drawbacks, the hosts file method does, nonetheless, have some usefulness and it can be used in conjunction with the other methods if necessary; just don't expect too much.

What is the hosts file?

Computers running one of the variants of Windows or GNU/Linux have a file called “hosts” that is used to translate names of computers into IP numbers. On a GNU/Linux based machine the file is located in the /etc directory. On a Windows 95, 98 or Me machine it is located in the Windows directory, while on Windows NT, 2000 and XP it is located in the system32\drivers\etc directory beneath the Windows directory.

The structure of the hosts file is very simple. Every line contains an IP number followed by the name of a computer, so a file might, for example, contain lines like those shown below:

127.0.0.1          myowncomputer.local
66.32.247.23       acomputer.faraway.net

Don't copy these lines into your own hosts file. They are shown only so that you can see how simple the structure of the file is.
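The structure described above can be read with a few lines of code. The following is a minimal Python sketch for illustration only. The function name parse_hosts is invented here, and the sketch assumes the common conventions that anything after a “#” is a comment, that a line may list several names after the IP number, and that the first entry for a name is the one that counts.

```python
def parse_hosts(text):
    """Return a dict mapping each host name to its IP number."""
    table = {}
    for line in text.splitlines():
        # Assumption: '#' starts a comment, as in common hosts files.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # an IP number with no name; nothing to record
        ip, names = fields[0], fields[1:]
        for name in names:
            # Assumption: the first entry found for a name wins.
            table.setdefault(name.lower(), ip)
    return table


sample = """\
127.0.0.1          myowncomputer.local
66.32.247.23       acomputer.faraway.net
"""
print(parse_hosts(sample))
```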

Whenever your computer is told to make contact with some other computer the first thing it must do is find the IP address of the computer it has been told to contact. On a Windows machine the first place it will check will usually be the hosts file. If it doesn't find the name in the hosts file it will make contact with a nameserver computer and ask that for the IP number. On a GNU/Linux machine the process is similar, except that the machine can be configured to check the nameserver either before or after checking the hosts file. If your GNU/Linux system is configured to check the nameserver first then you will either need to change its configuration or find a different method of blocking advertisements.
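On many GNU/Linux systems the lookup order is set by the “hosts:” line in the file /etc/nsswitch.conf, where “files” stands for the hosts file and “dns” for the nameserver. The Python sketch below shows one way the order might be checked. It is an illustration only: the function name is invented, and it assumes that file format rather than covering every possible configuration.

```python
def hosts_file_checked_first(nsswitch_text):
    """Return True if 'files' is listed before 'dns' on the hosts: line.

    Returns None when there is no hosts: line or no 'files' entry.
    """
    for line in nsswitch_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line.startswith("hosts:"):
            continue
        # Ignore bracketed action items such as [NOTFOUND=return].
        words = line[len("hosts:"):].split()
        sources = [w for w in words if not w.startswith("[")]
        if "files" not in sources:
            return None
        if "dns" not in sources:
            return True
        return sources.index("files") < sources.index("dns")
    return None


example = "passwd: files\nhosts: files dns\n"
print(hosts_file_checked_first(example))  # prints True
```

To try it against a real system, pass it the text of /etc/nsswitch.conf, assuming that file exists on your distribution.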

The need for a webserver

The hosts file method of advertisement blocking works best if you have a web server that you can refer to. Many Windows machines have a miniature web server installed and running by default (actually that has been a major contributory cause of poor security with Windows machines but here you are going to get some advantage out of it). If you have GNU/Linux it is also possible that you will have a web server running on your own machine — many Linux distributions include two or three http servers. If your computer is connected to a corporate local network then it is possible that one of the computers on the network will have a web server running. One way or another you will usually need a web server of some form in order to get the best speed.

Once you have found a webserver you need to know its IP address. If the webserver is running on your own computer then you can use the address 127.0.0.1 and if you are not sure whether you have a web server or not this is also the number you should try. The IP address 127.0.0.1 is a special address that always refers to the machine you are using. If you are going to refer to a web server on your local network then its IP address will probably be something like 192.168.x.y or 10.x.y.z (where x, y and z are numbers between 0 and 255).

You need a web server to refer to so that your web browser does not hang around waiting to get a response from a computer that doesn't exist. If you do not have a web server to refer to then you might find that this method blocks the adverts but makes your browsing very slow when you encounter advertisements.

If you do not know whether you have a web server running on your own computer you can try the following experiment: Edit the hosts file and add the following line:

127.0.0.1    hoopy.local

Make sure you save the file. Now go to your web browser and type in the address “hoopy.local” and see what happens. If you immediately see a page that contains an “Error 404 page not found” message, or something similar, then you probably have a web server running on your computer. If there is a delay of a few seconds and then just a blank page appears then you probably don't have a web server running on your computer and you will either need to install one, find one on your local network, choose a different method of blocking advertisements, or put up with delays when you are visiting web pages that contain references to web sites that you have blocked.
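A rough alternative to the browser experiment is to ask whether anything accepts a connection on the usual web server port. The following Python sketch is an illustration only. The function name is invented, it assumes that a web server would be listening on port 80, and a successful connection shows only that some program is listening there, not that it is definitely a web server.

```python
import socket


def web_server_answers(host="127.0.0.1", port=80, timeout=3.0):
    """Return True if something accepts a TCP connection on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    if web_server_answers():
        print("Something is listening on port 80 of this machine.")
    else:
        print("Nothing answered on port 80 of this machine.")
```

To test a machine on your local network instead, pass its IP address as the first argument.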

If you don't have a web server running on your own computer then there are a couple of other things you can try: If your computer is part of a local area network such as in an office then it is possible that your IT administrator will be able to set up a web server for you. If your computer connects to the internet on an ADSL connection and you have an ethernet modem (quite likely in a small office situation) then it is possible that you can use the ethernet modem as the web server ... see below.

Web servers for ADSL users

If you are using an ADSL connection and have an ethernet modem or some other moderately sophisticated box between your computer and your telecom provider's line then it is possible that your ADSL modem will contain its own miniature web server. This is discussed in another article on this web site: Web server in ADSL modems.

Adding advertisement servers to be blocked

This is where the hosts file method really starts to get a bit tedious; how do you obtain a list of advertising servers and how do you keep it up to date?

You might be able to get a list of computers by doing a web search. In September 2003 a list was being maintained at pgl.yoyo.org/adservers/ but this site is not maintained by LearnLinux and it might or might not exist when you try to visit it.

Another way of finding the names of the advertising servers is to look at the HTML source of the pages that contain advertisements you want to be rid of. This is somewhat tedious but not especially difficult. Most browsers have an option to view the HTML source. Some browsers, such as Opera, also make it easy for you to see the links in every page. If you are reviewing the HTML then the links to the advertising servers will be found by reviewing everything in the source that begins “http://” since this is the beginning of all web page URLs.
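The tedious part of reviewing the HTML can be partly automated. The Python sketch below pulls the computer names out of every URL in a piece of HTML and lists those that differ from the site being viewed. It is an illustration only: the function name and the sample page are invented, and a name appearing in the result is not necessarily an advertising server, so you still need to decide which ones to block.

```python
import re
from urllib.parse import urlsplit


def third_party_hosts(html, own_host):
    """Return the sorted host names found in URLs that are not own_host."""
    hosts = set()
    pattern = r"""https?://[^\s"'<>]+"""
    for url in re.findall(pattern, html, flags=re.IGNORECASE):
        try:
            host = urlsplit(url).hostname  # already lower-cased
        except ValueError:
            continue  # skip anything that is not a well-formed URL
        if host and host != own_host.lower():
            hosts.add(host)
    return sorted(hosts)


# An invented page; the names match the invented examples in this article.
page = '''
<a href="http://www.example.org/news.html">News</a>
<img src="http://ads1.badcompany.com/banner.gif">
<script src="http://adserver.we-spy-on-you.net/track.js"></script>
'''
print(third_party_hosts(page, "www.example.org"))
```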

In any case what you are going to create is a list that looks something like that shown below. Please note that the computer names in this list were all invented and probably do not really exist in the world.

127.0.0.1   ads1.badcompany.com
127.0.0.1   ads.rude-corporation.com
127.0.0.1   adserver.we-spy-on-you.net

Of course if the web server you are referring to is not on your own computer then in each line you need to replace the 127.0.0.1 with the correct IP address for the server you are using.
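If you keep the names of the advertising servers in a plain list, the hosts file lines can be generated rather than typed by hand, which also makes it easy to change the IP address later. The following Python sketch is an illustration only. The function name is invented and the address 192.168.1.10 is a made-up example of a web server on a local network.

```python
def block_lines(names, server_ip="127.0.0.1"):
    """Return hosts-file lines that point every name at server_ip."""
    seen = set()
    lines = []
    for name in names:
        name = name.strip().lower()
        if not name or name in seen:
            continue  # skip blanks and duplicates
        seen.add(name)
        lines.append(f"{server_ip}   {name}")
    return lines


# Invented names, as in the list above; 192.168.1.10 is hypothetical.
names = [
    "ads1.badcompany.com",
    "ads.rude-corporation.com",
    "adserver.we-spy-on-you.net",
]
print("\n".join(block_lines(names, "192.168.1.10")))
```

The sketch only prints the lines. Copying them into the hosts file is left as a manual step so that nothing is overwritten by accident.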

The names of adservers will change moderately rapidly and so your hosts file will need to be updated from time to time. This is one of the big drawbacks of this method.

After you have added a new server name and saved the hosts file you should find that the new name is blocked immediately. However it is possible that your web browser or some other software on your computer will still remember the previous or original IP address and will continue to use it. Next time you start your browser or computer the previous IP address will be forgotten and the advertisement server will thenceforth be blocked.

Summary

The hosts file is one of several tools that can be used to block advertisements and hinder the attempts that are made by corporations to track your web browsing habits. It is not a perfect method but has the advantage of simplicity and speed of configuration. The method works best if you have a webserver running on your own computer or own local network so that you can redirect the advert requests to your own webserver and have them intercepted rapidly. Without a local web server this method will possibly be irritatingly slow.

Other precautions

This particular document discusses the use of the hosts file to block advertisements. There are, however, some other privacy and security related precautions that can be considered for each individual machine. These are:

Instruct your web browser to reject third-party cookies.

Instruct your web browser to discard cookies at the end of a browsing session.

Instruct your web browser to ignore (refuse to open) pop‑up windows that were not explicitly requested by the user.

These three techniques will, by themselves, hinder (but not stop) the attempts to track your browsing habits and will relieve you from the tedium of some adverts. If these features are not available on your web browser then consider changing your web browser. The “Opera” browser (available for GNU/Linux and Windows) provides all of these features in a configurable form. As always, a search of Usenet will yield a variety of opinions about different browsers and their relative merits.

Links

pgl.yoyo.org/adservers/
A list of advertisement servers ready to be copied to your own hosts file.
Opera Software
Producers of a rather good web browser and email client. Versions are available for GNU/Linux, MS-Windows and certain mobile telephones and handheld computers. It has useful configuration options, is fast, and provides several useful features that make it easier to view and navigate web pages.
Mozilla
Producers of popular web browsers, email client and other things.
Advertisement blocking using DNS
The use of a nameserver to block advertisements is a method that is very suitable for creating an effective block for a cluster of computers on a network. It requires more technical knowledge than the hosts file method described in this article but, once running, it can be made effective for all computers on the network without having to constantly update each individual computer.

The following are links to organizations that provide other methods of blocking advertisements and unwanted material. You could use their methods instead of, or in addition to, your own DNS of the sort described in the above article. They are mentioned here only to give you something else to think about and they are not affiliated with or recommended by or otherwise associated with LearnLinux.co.uk (the site you are looking at now). There might be other organizations with similar products or with products that are more suitable for your needs. An internet search for terms such as “junkbuster”, “advertisement blocklist”, “ad blocking” will possibly find you some useful pages to explore.

guidescope.com
A company that provides software for blocking unwanted material.
junkbusters.com
An organization that provides software for blocking unwanted material.
Metronet
Metronet are an ADSL provider in the United Kingdom. Not only do Metronet give better value for money than most big-name ADSL providers but, of particular relevance to this article, they also offer a free, configurable, firewall at their end of your connection and (starting March 2005) allow you to browse the web via a configurable proxy server that you can use to block adverts, adult sites or other sites of your own choosing. The providers of LearnLinux (the website you are now viewing) have used Metronet for ADSL since late 2003 and have found their service to be not perfect but generally pretty good. If you are in the United Kingdom, using an ADSL connection and want advertisement blocking and other protection for your connection but do not have sufficient technical skills to provide your own blocking facilities then (as at 2nd March 2005) Metronet are well worth considering.

Glossary

IP number:

An IP number or IP address is a number that identifies a computer on the internet or on your local network. In 2003 the most common form of IP number was the four-octet dotted decimal. These are numbers that have the form w.x.y.z where w, x, y, and z are integer numbers in the range 0 to 255 inclusive. Examples:

     213.180.45.240

     66.35.112.1

URL

Uniform Resource Locator. This is a name used to identify a specific computer, web page, service or something else on the internet, or on your local network or even on your own computer. The names are constructed in a standard format to make them easy to analyse. A URL is something that a human can read, write and understand.
