ishikun

Reputation: 428

Multiple User-agents: * in robots.txt

Related question: Multiple User Agents in Robots.txt

I'm reading a robots.txt file on a certain website and it seems to be contradictory to me (but I'm not sure).

User-agent: *
Disallow: /blah
Disallow: /bleh
...
... 
...several more Disallows

User-agent: *
Allow: /

I know that you can exclude certain robots by specifying multiple User-agent groups, but this file seems to be saying that all robots are disallowed from a bunch of paths yet also allowed to access everything. Or am I reading this wrong?

Upvotes: 0

Views: 6632

Answers (1)

unor

Reputation: 96707

This robots.txt is invalid, as there must only be one record with User-agent: *. If we fix it, we have:

User-agent: *
Disallow: /blah
Disallow: /bleh
Allow: /

Allow is not part of the original robots.txt specification, so not all parsers will understand it (parsers that don't must ignore the line).

For parsers that understand Allow, this line simply means: allow everything (else). But that is the default anyway, so this robots.txt has the same meaning:

User-agent: *
Disallow: /blah
Disallow: /bleh

Meaning: Everything is allowed except those URLs whose paths start with /blah or /bleh.
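You can check this interpretation with Python's standard-library robots.txt parser, which does understand Allow. A minimal sketch (the host and paths below are just the hypothetical ones from the question):

```python
from urllib.robotparser import RobotFileParser

# The fixed robots.txt from the answer, as a list of lines.
rules = """\
User-agent: *
Disallow: /blah
Disallow: /bleh
Allow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# Paths starting with /blah or /bleh are blocked...
print(rp.can_fetch("SomeBot", "https://example.com/blah/page"))  # False
# ...while everything else falls through to Allow: /
print(rp.can_fetch("SomeBot", "https://example.com/other"))      # True
```

The same two results come back even if you delete the `Allow: /` line, which illustrates the point above: allowing everything else is the default anyway.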

If the Allow line came before the Disallow lines, some parsers might ignore the Disallow lines. But as Allow is not part of the specification, this behavior can differ from parser to parser.

Upvotes: 1
