Reputation: 336
What happens when a Disallow line includes more than one URI? Example:
Disallow: / tmp/
The white space was introduced by mistake.
Is there a standard way for web crawlers to deal with this? Do they ignore the whole line, or just ignore the second URI and treat it like:
Disallow: /
Upvotes: 0
Views: 60
Reputation: 1752
Google, at least, seems to treat the first non-space character as the beginning of the path and the last non-space character as the end. Anything in between is counted as part of the path, even if it's a space. Google also silently percent-encodes certain characters in the path, including spaces.
So the following:
Disallow: / tmp/
will block:
http://example.com/%20tmp/
but it will not block:
http://example.com/tmp/
I have verified this on Google's robots.txt tester. YMMV for crawlers other than Google.
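For illustration, here is a rough Python sketch of the behavior described above. It is not Google's actual parser: the function names `parse_disallow` and `is_blocked` are made up for this example, the set of characters left unencoded is an assumption, and matching is a plain prefix check with no wildcard support.

```python
from urllib.parse import quote, urlsplit


def parse_disallow(line: str) -> str:
    """Return the normalized path of a 'Disallow:' line (hypothetical helper)."""
    field, _, value = line.partition(":")
    if field.strip().lower() != "disallow":
        raise ValueError("not a Disallow line")
    # Trim only the leading and trailing whitespace; spaces inside
    # the value remain part of the path.
    path = value.strip()
    # Percent-encode spaces and other unsafe characters. Keeping '/', '*',
    # '$' and '%' unencoded is an assumption made for this sketch.
    return quote(path, safe="/*$%")


def is_blocked(url: str, rule_path: str) -> bool:
    """Plain prefix match of the URL path against the rule path."""
    path = urlsplit(url).path or "/"
    # An empty Disallow value blocks nothing.
    return rule_path != "" and path.startswith(rule_path)


rule = parse_disallow("Disallow: / tmp/")
print(rule)                                            # /%20tmp/
print(is_blocked("http://example.com/%20tmp/", rule))  # True
print(is_blocked("http://example.com/tmp/", rule))     # False
```

Under these assumptions the rule becomes `/%20tmp/`, which matches the first URL but not the second, consistent with what the tester reported.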
Upvotes: 1