henry.oswald

Reputation: 5434

robots.txt disallow: spider

I'm looking at a robots.txt file of a site I would like to do a one off scrape and there is this line:

User-agent: spider

Disallow: /

Does this mean they don't want any spiders? I was under the impression that * was used to match all spiders. If this rule applied to every crawler, it would of course block spiders such as Google's.

Upvotes: 0

Views: 778

Answers (1)

Arnaud Le Blanc

Reputation: 99909

This just asks agents that identify themselves as spider to be gentle enough not to browse the site.

The name spider has no special meaning here; it only matches robots that send that exact user-agent name.

robots.txt files are purely advisory and are read only by robots. The way to exclude all robots is to use * as the user-agent:

User-Agent: *
Disallow: /
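You can check how a crawler would interpret either rule set with Python's standard urllib.robotparser module; this is a quick sketch (the rules and URL are just examples):

```python
from urllib.robotparser import RobotFileParser

# The exact rules from the question: only the agent named "spider" is blocked.
rules = [
    "User-agent: spider",
    "Disallow: /",
]

rp = RobotFileParser()
rp.parse(rules)

# An agent calling itself "spider" is disallowed everywhere...
print(rp.can_fetch("spider", "/some/page"))     # False

# ...but any other agent, e.g. Googlebot, is unaffected.
print(rp.can_fetch("Googlebot", "/some/page"))  # True
```

Swapping `User-agent: spider` for `User-agent: *` would make `can_fetch` return False for every agent.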

Upvotes: 2
