Reputation: 428
Related question: Multiple User Agents in Robots.txt
I'm reading a robots.txt file on a certain website and it seems contradictory to me (but I'm not sure).
User-agent: *
Disallow: /blah
Disallow: /bleh
...
...
...several more Disallows
User-agent: *
Allow: /
I know that you can give different rules to different robots by specifying multiple User-agents, but this file seems to be saying that all robots are disallowed from a bunch of paths yet also allowed to access everything. Or am I reading this wrong?
Upvotes: 0
Views: 6632
Reputation: 96707
This robots.txt is invalid, as there must only be one record with User-agent: *. If we fix it, we have:
User-agent: *
Disallow: /blah
Disallow: /bleh
Allow: /
Allow is not part of the original robots.txt specification, so not all parsers will understand it (they have to ignore the line).
For parsers that do understand Allow, this line simply means: allow everything (else). But that is the default anyway, so this robots.txt has the same meaning:
User-agent: *
Disallow: /blah
Disallow: /bleh
Meaning: Everything is allowed except those URLs whose paths start with /blah or /bleh.
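To see this in practice, here is a minimal sketch using Python's standard urllib.robotparser (the bot name and example.com URLs are placeholders):

from urllib.robotparser import RobotFileParser

# The fixed robots.txt, as a string.
rules = """\
User-agent: *
Disallow: /blah
Disallow: /bleh
Allow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Paths starting with /blah or /bleh are blocked; everything else is allowed.
print(parser.can_fetch("MyBot", "https://example.com/blah/page"))   # False
print(parser.can_fetch("MyBot", "https://example.com/bleh"))        # False
print(parser.can_fetch("MyBot", "https://example.com/other/page"))  # True

Deleting the Allow: / line gives the same three results, which is the point: it only restates the default.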
If the Allow line came before the Disallow lines, some parsers might ignore the Disallow lines. But as Allow is not part of the specification, this can differ from parser to parser.
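For example, CPython's urllib.robotparser applies the first rule whose path matches the URL, so with Allow: / listed first the Disallow line is never reached (a sketch of one parser's behavior; others may differ):

from urllib.robotparser import RobotFileParser

# Same record, but with Allow: / before the Disallow line.
rules = """\
User-agent: *
Allow: /
Disallow: /blah
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Allow: / matches every path first, so even /blah comes back as fetchable.
print(parser.can_fetch("MyBot", "https://example.com/blah"))  # True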
Upvotes: 1