Reputation: 3671
Some of my directories are showing up in Google, Bing, etc. that I don't necessarily want the world to see. How can I prevent search engines from crawling these pages/directories? Also, how do I remove entries that have already been indexed?
Upvotes: 1
Views: 258
Reputation: 20235
Most search engines check for a robots.txt file before they start crawling your site. If you don't want them to crawl certain directories, create a robots.txt file in your site's root directory and add this to it:
User-agent: *
Disallow: /my_private_dir
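If you want to sanity-check the rules before deploying them, here is a minimal sketch using Python's standard-library urllib.robotparser. The helper name is_allowed, the example.com URLs, and the "MyBot" user agent are made up for illustration; this only shows how a well-behaved crawler would interpret the file, not what any particular search engine actually does.

```python
from urllib.robotparser import RobotFileParser

# The same rules as above, held in a string so nothing is fetched over the network.
ROBOTS_TXT = """\
User-agent: *
Disallow: /my_private_dir
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "MyBot") -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text.

    "MyBot" is a hypothetical crawler name used only for this example.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from memory instead of read()
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/my_private_dir/secret.html",
        "https://example.com/index.html",
    ):
        print(url, "->", "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked")
```

Note that Disallow rules match by path prefix, so without a trailing slash /my_private_dir also covers paths such as /my_private_dir_old. Add the trailing slash if you only mean the directory itself.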
If you want an example robots.txt file, here is Stack Overflow's: https://stackoverflow.com/robots.txt
Upvotes: 1
Reputation: 104050
The friendly web crawlers (Google, Bing, Yahoo, Baidu, etc.) will respect your robots.txt file. An example from the very helpful http://www.robotstxt.org/:
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
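Of course, robots.txt is only a request, and the file itself is publicly readable, so it does nothing to stop a crawler or visitor that chooses to ignore it. If you really want to restrict your private content, you'd be better served by using your web server's authentication and authorization tools or by restricting access by IP address.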
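To illustrate the "restrict by address" idea, here is a rough sketch of the check itself using Python's standard ipaddress module. In practice you would normally configure this in the web server rather than in application code; the ALLOWED_NETWORKS values and the client_may_access helper are hypothetical and exist only for this example.

```python
from ipaddress import ip_address, ip_network

# Hypothetical allow-list: one documentation-range network plus localhost.
ALLOWED_NETWORKS = [
    ip_network("192.0.2.0/24"),
    ip_network("127.0.0.1/32"),
]


def client_may_access(client_ip: str) -> bool:
    """Return True if client_ip falls inside any allowed network."""
    addr = ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


if __name__ == "__main__":
    for ip in ("192.0.2.17", "203.0.113.5"):
        print(ip, "->", "allow" if client_may_access(ip) else "deny")
```

Unlike robots.txt, a check like this is enforced on the server side, so it applies to every client whether or not it is a polite crawler.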
Upvotes: 2