jeph perro

Reputation: 6422

How to configure robots.txt file to block all but 2 directories

I don't want search engines to index most of my website.

I do, however, want search engines to index two folders (and their children). This is what I set up, but I don't think it works: I still see pages in Google that I wanted to hide.

Here's my robots.txt

User-agent: *
Allow: /archive/
Allow: /lsic/
User-agent: *
Disallow: /

What's the correct way to disallow all folders except for two?

Upvotes: 9

Views: 5959

Answers (1)

T9b

Reputation: 3502

I gave a tutorial about this on this forum here, and Wikipedia covers it here.

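Put both Allow lines and the Disallow line in a single User-agent group, with the Allow lines first. Parsers that stop at the first matching rule need that order, and parsers that follow RFC 9309 (Google's included) pick the most specific, i.e. longest, matching path, so /archive/ and /lsic/ beat / either way: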

User-agent: *
Allow: /archive/
Allow: /lsic/
Disallow: /
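
One way to sanity-check rules like these before deploying them is Python's standard-library urllib.robotparser, which applies the rules of a group in file order. This is only a rough sketch: the sample paths and the is_allowed helper are made up for illustration, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The single-group rules from the answer above.
ROBOTS_TXT = """\
User-agent: *
Allow: /archive/
Allow: /lsic/
Disallow: /
"""


def is_allowed(path, user_agent="*"):
    """Return True if this parser would let user_agent fetch path.

    Hypothetical helper for illustration; it parses the rules from the
    string above instead of downloading a live robots.txt.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Sample paths are assumptions, not taken from the asker's site.
    samples = (
        "/archive/2009/post.html",
        "/lsic/index.html",
        "/private/report.html",
        "/",
    )
    for path in samples:
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Running the script prints one line per sample path, so you can confirm that only the two folders are crawlable before uploading the file.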

But I suspect it might be too late. Once a page is indexed, it's pretty hard to remove. The most reliable options are to move it to another folder or to password-protect the folder. You should be able to do that in your host's cPanel.

Upvotes: 13
