Cloaking



What is cloaking?
How is cloaking implemented?
Agent Name Delivery
IP Address Delivery
Penalty for Cloaking


Cloaking Warning

Cloaking is an unethical (spam) method of improving a website's ranking in the search engines. It is discussed here for educational purposes only. You are advised NOT to use cloaking as an SEO method on any of your websites. Doing so will in fact hurt your rankings in the search engines, and your site may even be banned. We do not teach, implement, or endorse any unethical search engine optimization methods.


What is cloaking?

Cloaking is a technique that serves visitors a different page from the one listed in the search engine or directory. It is primarily used to show an optimized page to the search engines and a different page to humans at the same URL. In other words, browsers such as Netscape and MSIE are served one page, and spiders visiting the same address are served a different page. Most search engines will penalize a site if they discover that it is using cloaking, and some go further and delete the site from their index.

How is cloaking implemented?

There are two main methods of delivering cloaked pages: looking at the User-Agent HTTP header of the request, or looking at the IP address of whoever is requesting the page. The two methods are known as "Agent Name Delivery" and "IP Address Delivery." Either way, to cloak a web page effectively, the web server must be able to determine whether the visitor is a human or a search engine.

Let's look at Agent Name Delivery first.

Agent Name Delivery

Your first step is to read your web server's log files to analyse the traffic to your site. The first clue in spotting spiders is to look in your log files for requests for your robots.txt file. The robots.txt file is usually the first thing a spider looks for when it visits your site. Humans rarely want to look at your robots.txt file, so you can generally assume that anything requesting this file is a spider. You can also identify the spider by looking at the agent name in the request that was made for your robots.txt file. A few common agent names are listed below.

AltaVista = Scooter
Excite = Architext
Google = Googlebot
Inktomi = Slurp
Lycos = T-Rex
NorthernLight = Gulliver

If you do not have a robots.txt file, look in your log files for the errors (typically 404 Not Found) recorded when agents requested the file and it did not exist.
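The log check described above can be sketched in a few lines of Python. This is only an illustration: it assumes the server writes the common "combined" log format, and the sample log lines, the `LOG_LINE` pattern, and the `robots_txt_agents` function name are hypothetical. Requests are counted whatever their status code, so a missing robots.txt file (a 404 error) is covered too.

```python
import re
from collections import Counter

# Combined log format:
# host ident user [time] "request" status size "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def robots_txt_agents(lines):
    """Count the user agents that requested /robots.txt (any status code)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path") == "/robots.txt":
            counts[match.group("agent")] += 1
    return counts

# Hypothetical sample lines, for illustration only.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2003:13:55:36 -0700] "GET /robots.txt HTTP/1.0" '
    '404 209 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '192.0.2.55 - - [10/Oct/2003:13:56:01 -0700] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    '192.0.2.77 - - [10/Oct/2003:14:02:11 -0700] "GET /robots.txt HTTP/1.0" '
    '200 68 "-" "Scooter/3.3"',
]

agents = robots_txt_agents(SAMPLE_LOG)
for agent, hits in agents.items():
    print(hits, agent)
```

Only the two agents that asked for robots.txt are reported; the browser request for index.html is ignored.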

All browsers and search engine spiders have a name. The user agent field for a human visitor usually lists what web browser software is being used, such as:

"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

The user agent field for a search engine usually identifies the search engine robot, such as this user agent field for Yahoo Slurp:

"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"

With this knowledge, delivering a specific page based on agent name is a rather simple task. You simply utilize a web script that says something to this effect: if Agent Name equals A or B or C, serve Page-1 (the spider page), else serve Page-2, where Page-2 would be the viewer's page.
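The agent name test itself is nothing more than a substring match. The sketch below, in Python, covers only that identification step, using the agent names listed earlier; the `SPIDER_NAMES` table and the `identify_spider` function are hypothetical names chosen for illustration. Keep in mind that any client can send whatever user agent string it likes, so a match is a hint about the visitor, not proof.

```python
# Agent name fragments and engines, taken from the list earlier in this article.
SPIDER_NAMES = {
    "Scooter": "AltaVista",
    "Architext": "Excite",
    "Googlebot": "Google",
    "Slurp": "Inktomi",
    "T-Rex": "Lycos",
    "Gulliver": "NorthernLight",
}

def identify_spider(user_agent):
    """Return the engine whose spider name appears in the agent string, else None."""
    lowered = user_agent.lower()
    for fragment, engine in SPIDER_NAMES.items():
        if fragment.lower() in lowered:
            return engine
    return None

browser = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
spider = "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"

print(identify_spider(browser))  # no known spider name in this string
print(identify_spider(spider))   # contains "Slurp"
```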

Next, we look at IP Address Delivery.

IP Address Delivery

An IP address (Internet Protocol address) is a numeric address that identifies your connection to the Internet. Search engine spiders not only have a name to identify themselves but also have an IP address. For example, web site traffic from 64.62.82.x is most likely a visit from Googlebot, the Google search engine spider. That range covers just one of Google's many spiders.

Since you can 'sniff' for the IP address when someone visits your site, you can use this information to serve specific pages to the spiders. This method is more complicated than Agent Name Delivery because it requires you to maintain an exhaustive list of IP addresses. Also, IP addresses can change and new ones are always being added.
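Checking a visitor's address against such a list could look roughly like the Python sketch below, which uses the standard `ipaddress` module. The `KNOWN_SPIDER_NETWORKS` table and the `spider_for_ip` function are hypothetical; the single entry is simply the 64.62.82.x example from the text written as a /24 network, not a verified or current list of crawler addresses. The sketch also shows why upkeep is the hard part: every new or changed address range has to be added by hand.

```python
import ipaddress

# Assumption: a hand-maintained table. The one entry is the 64.62.82.x
# example from the text, not a verified list of crawler addresses.
KNOWN_SPIDER_NETWORKS = {
    "Googlebot": [ipaddress.ip_network("64.62.82.0/24")],
}

def spider_for_ip(address):
    """Return the spider name whose listed networks contain the address, else None."""
    ip = ipaddress.ip_address(address)
    for name, networks in KNOWN_SPIDER_NETWORKS.items():
        if any(ip in network for network in networks):
            return name
    return None

print(spider_for_ip("64.62.82.17"))  # inside the listed range
print(spider_for_ip("192.0.2.1"))    # not in any listed range
```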

The advantage of IP Address Delivery is that a visitor cannot easily 'fake' a spider's IP address the way an agent name can be faked. Consequently, it is very difficult for anyone other than the spider to see the code that is presented to the spider.

But there are thousands of spiders crawling the web, and if you are cloaking using IP Address Delivery, knowing who these spiders are is going to be very important to you. If your site is cloaked and an unrecognized spider visits, it is too late to worry about whether your cloaking script served up the right page.

Penalty for Cloaking

Cloaking is in violation of most search engine policies and is very likely to get your site banned. This is what Google says on its Information for Webmasters page:

"To preserve the accuracy and quality of our search results, Google may permanently ban from our index any sites or site authors that engage in cloaking to distort their search rankings."


