Reputation: 6422
I don't want any search search engines to index most of my website.
I do however want search engines to index 2 folders ( and their children ). This is what I set up, but I don't think it works, I see pages in Google that I wanted to hide:
Here's my robots.txt
User-agent: *
Allow: /archive/
Allow: /lsic/
User-agent: *
Disallow: /
What's the correct way to disallow all folders, except for 2 ?
Upvotes: 9
Views: 5959
Reputation: 3502
I gave a tutorial about this on this forum here. And in Wikipedia here
Basically the first matching robots.txt pattern always wins:
User-agent: *
Allow: /archive/
Allow: /lsic/
Disallow: /
But I suspect it might be too late. Once the page is indexed it's pretty hard to remove it. The only way is to shift it to another folder or just password protect the folder. You should be able to do that in your hosts CPanel.
Upvotes: 13