Reputation: 491
I am trying to crawl a website but the robots.txt has just the following line:
User-agent: *
Does this mean the site owner doesn't mind if I crawl their website?
Upvotes: 1
Views: 504
Reputation: 96697
Yes, if User-agent: * is the only line in the robots.txt, you are allowed to crawl everything.
Only Disallow lines can list the URL paths (or path prefixes) that must not be crawled. If a robots.txt has no Disallow lines, nothing is disallowed.
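To check this yourself, here is a minimal sketch using Python's standard urllib.robotparser module. The bot name, the example.com URL, and the helper function allowed are made up for illustration, and the result reflects how this one parser interprets the file; other crawlers may behave differently.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_lines, url, agent="ExampleBot"):
    """Parse robots.txt lines and report whether `agent` may fetch `url`.

    `allowed` and "ExampleBot" are hypothetical names for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(agent, url)


# A robots.txt with only the User-agent line, as in the question.
only_user_agent = ["User-agent: *"]
# The same file with one Disallow line added, for comparison.
with_disallow = ["User-agent: *", "Disallow: /private"]

url = "https://example.com/private/page.html"  # hypothetical URL
bare_result = allowed(only_user_agent, url)
blocked_result = allowed(with_disallow, url)
print(bare_result, blocked_result)
```

With no Disallow line, the parser has no rule to apply, so it treats the URL as fetchable; adding Disallow: /private is what blocks it.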
That said, the author of that robots.txt may have made an error. User-agent lines are typically followed by Disallow lines (or other lines, such as Allow). There is no point in starting a record¹ without stating any rules for the matched user agents.
¹ A record starts with one or more User-agent lines and is separated from other records by a blank line. User-agent: * matches all user agents not matched by any other User-agent line in that robots.txt.
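The record matching described in the footnote can be sketched with Python's standard urllib.robotparser. The robots.txt content, the bot names BadBot and OtherBot, and the example.com URLs are assumptions chosen for illustration, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Two records separated by a blank line (hypothetical content).
robots_lines = [
    "User-agent: BadBot",
    "Disallow: /",
    "",
    "User-agent: *",
    "Disallow: /private",
]

records = RobotFileParser()
records.parse(robots_lines)

# BadBot is matched by its own record, which disallows everything.
bad_public = records.can_fetch("BadBot", "https://example.com/public")
# OtherBot is not named anywhere, so it falls through to the * record.
other_public = records.can_fetch("OtherBot", "https://example.com/public")
other_private = records.can_fetch("OtherBot", "https://example.com/private/x")
print(bad_public, other_public, other_private)
```

Here the * record only applies to OtherBot because no other User-agent line names it.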
Upvotes: 2