Dónal

Reputation: 187399

Sitemap/robots.txt configuration conflict

My robots.txt contains the following rules:

Disallow: /api/

Allow: /
Allow: /apiDocs

The /apiDocs URL is in the sitemap, but according to Google Webmaster Tools, these robots.txt rules prohibit it from being crawled. I want to prevent all URLs that match /api/* from being crawled, but allow the URL /apiDocs to be crawled.

How should I change my robots.txt to achieve this?

Upvotes: 0

Views: 287

Answers (1)

unor

Reputation: 96737

  • Line breaks aren’t allowed in a record (you have one between your Disallow and the two Allow lines).

  • You don’t need Allow: / (it’s the same as Disallow:, which is the default).

  • You disallow crawling of /api/, i.e., any URL whose path starts with "/api/". The path /apiDocs has "Docs" rather than a "/" after "api", so that rule doesn’t match it, and there is no need for Allow: /apiDocs: it’s allowed anyway.

So your fallback record should look like:

User-Agent: *
Disallow: /login/
Disallow: /logout/
Disallow: /admin/
Disallow: /error/
Disallow: /festival/subscriptions
Disallow: /artistSubscription
Disallow: /privacy
Disallow: /terms
Disallow: /static
Disallow: /api/

When a bot is matched by this "fallback" record, it is allowed to crawl URLs whose paths start with /apiDocs.
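
If you want to double-check the behaviour, here is a minimal sketch using Python’s standard-library urllib.robotparser. The host example.com and the sample path /api/users are placeholders, and Python’s parser applies rules in file order rather than by longest match as Google does, but since this record contains no Allow lines the outcome is the same.

from urllib.robotparser import RobotFileParser

# The corrected fallback record from above, embedded as a string.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /login/
Disallow: /logout/
Disallow: /admin/
Disallow: /error/
Disallow: /festival/subscriptions
Disallow: /artistSubscription
Disallow: /privacy
Disallow: /terms
Disallow: /static
Disallow: /api/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Blocked: the path starts with /api/
print(parser.can_fetch("*", "https://example.com/api/users"))  # False

# Allowed: /apiDocs has no "/" after "api", so no Disallow rule matches it
print(parser.can_fetch("*", "https://example.com/apiDocs"))    # True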

Upvotes: 1
