Reputation: 259
I have been going through different forums and was wondering if this is correct. I am trying to prevent bots from crawling query-string URLs only under specific subpages (e.g. www.website.com/subpage/?query=sample), while making sure /subpage/ itself does not get disallowed as well. Please correct me if I am wrong.
File: robots.txt
User-agent: *
Disallow: /subpage/*?
Upvotes: 1
Views: 1074
Reputation:
From what I see here, you are very close:
User-agent: *
# block any URL under /subpage/ that carries a query string
Disallow: /subpage/*?*
# but keep the plain /subpage/ URL itself crawlable
Allow: /subpage/$
You can test this from the comfort of your own browser by using a robots.txt testing add-on or extension.
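If you prefer to check it programmatically, here is a minimal sketch (plain Python, simplified Google-style matching, illustrative URL paths only, not an official parser) of how those two rules would be applied:

# Minimal sketch of Google-style robots.txt matching: '*' is a wildcard,
# a trailing '$' anchors the end of the URL, and the longest matching
# rule wins (tie-breaking is ignored here for brevity).
import re

def matches(pattern: str, path: str) -> bool:
    """True if a robots.txt path pattern matches the given URL path."""
    anchored = pattern.endswith("$")
    core = pattern[:-1] if anchored else pattern
    regex = "^" + ".*".join(re.escape(part) for part in core.split("*"))
    if anchored:
        regex += "$"
    return re.match(regex, path) is not None

rules = [("Disallow", "/subpage/*?*"), ("Allow", "/subpage/$")]

for path in ["/subpage/",
             "/subpage/?query=sample",
             "/subpage/page.html",
             "/otherpage/?query=sample"]:
    hits = [(len(pattern), kind) for kind, pattern in rules if matches(pattern, path)]
    verdict = max(hits)[1] if hits else "Allow"   # no matching rule means crawlable
    print(f"{path:28} -> {verdict}")

Running this prints Disallow only for /subpage/?query=sample, which is the behaviour you are after.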
Upvotes: 1
Reputation: 594
I do not think you can specify a query string in Disallow. The value you set for Disallow is referenced as a directory in the documentation (not as a URI or URL).
You can, however, achieve your objective by using Sitemap.xml: exclude the URLs that you do not want indexed from the sitemap.
Google Webmaster Tools also gives some granular control over how query-string parameters should be interpreted. Not sure if that serves your purpose.
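For example, here is a minimal sketch (plain Python with the standard library, illustrative URLs only) of generating a sitemap.xml that simply leaves out every query-string variant:

# Minimal sketch: write a sitemap.xml that omits any URL carrying a
# query string, so only the clean /subpage/ address is listed.
from urllib.parse import urlsplit
import xml.etree.ElementTree as ET

all_urls = [
    "https://www.website.com/",
    "https://www.website.com/subpage/",
    "https://www.website.com/subpage/?query=sample",  # will be skipped
]

urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for url in all_urls:
    if urlsplit(url).query:        # anything after '?' -> leave it out
        continue
    ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url

ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)

Keep in mind this only controls what you advertise to crawlers; it does not block them from a URL the way a robots.txt rule does.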
Upvotes: 0