SEO: Blocking Search Engines From Pages

Contrary to popular belief, the spiders sent out by the major search engines do not have to crawl everything on a site. You can keep a spider away from a page by instructing it, through a robots meta tag or a robots.txt file, not to come near that page.
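As a minimal sketch, the robots meta tag goes in the page's <head>; the noindex and nofollow values ask compliant spiders not to add the page to their index or follow its links:

    <head>
      <!-- Ask compliant spiders to skip this page and its links -->
      <meta name="robots" content="noindex, nofollow">
    </head>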

Webmasters can instruct spiders not to crawl certain files or directories through the standard robots.txt file in the root directory of the domain. Additionally, a page can be explicitly excluded from a search engine's index by using a robots meta tag. If for some reason you do not want a spider to crawl a page, you have the means to stop it.
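For example, a robots.txt file at the root of the domain might contain rules like the following (the paths here are placeholders, not a prescription):

    User-agent: *
    Disallow: /private/
    Disallow: /drafts/old-page.html

The User-agent line says which spiders the rules apply to (* means all of them), and each Disallow line names a file or directory that should not be crawled.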

When a search engine visits a site, the robots.txt file in the root folder is the first file it fetches. The file is then parsed, and only pages that are not disallowed will be crawled. This is not always foolproof, however. Spiders have a habit of leaving a page and coming back to look at it a second time, and because a crawler may keep a cached copy of the robots.txt file, it may occasionally crawl pages a webmaster did not want crawled.
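To illustrate the fetch-then-check order described above, here is a short Python sketch using the standard library's urllib.robotparser; the rules and URLs are made up for the example:

    from urllib.robotparser import RobotFileParser

    # In practice a spider would first fetch https://example.com/robots.txt;
    # here the rules are supplied inline to keep the sketch self-contained.
    rules = [
        "User-agent: *",
        "Disallow: /cart/",
        "Disallow: /search",
    ]

    parser = RobotFileParser()
    parser.parse(rules)

    # A well-behaved spider checks every URL against the parsed rules
    # before requesting it.
    for url in ("https://example.com/products.html",
                "https://example.com/cart/checkout"):
        verdict = "crawl" if parser.can_fetch("*", url) else "skip"
        print(url, "->", verdict)

Running this prints "crawl" for the product page and "skip" for the cart page. The caching problem mentioned above comes from the same mechanism: a spider working from yesterday's copy of the rules can still fetch a page that was disallowed today.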

Pages that most webmasters prefer not to have crawled include login-specific pages such as shopping carts and user-specific content such as results from internal searches. Depending on its content, you might also want to block a guest book that you expect to fill up with spam, or a feedback system that is not very flattering to you. It is also a good idea to keep spiders away from a page built largely of animation or Flash, since a spider can mistake such a page for a malfunctioning site.
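Expressed as robots.txt rules, the examples from this paragraph might look like the following; the paths are assumptions about how such a site could be laid out:

    User-agent: *
    Disallow: /cart/       # shopping cart and other login-specific pages
    Disallow: /search      # internal search results
    Disallow: /guestbook/  # spam-prone guest book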



