JustJeffy

Reputation: 97

Robots.txt file to allow all root PHP files except one and disallow all subfolder content

I am struggling with a robots.txt file in the following scenario: I would like all *.php files in the root folder to be indexed except for one (exceptions.php), and I would like no content from any subdirectory of the root folder to be indexed.

I have tried the following, but it still allows PHP files in subdirectories to be crawled, even though the subdirectories themselves are not indexed.

....

# robots.txt 
User-agent: *
Allow: /*.php
disallow: /*
disallow: /exceptions.php

....

Can anyone help with this?

Upvotes: 1

Views: 2057

Answers (1)

unor

Reputation: 96567

For crawlers that interpret * in Disallow values as a wildcard (it’s not part of the robots.txt spec, but many crawlers support it anyway), this should work:

User-agent: *
Disallow: /exceptions.php
Disallow: /*/

This disallows URLs like:

  • https://example.com/exceptions.php
  • https://example.com//
  • https://example.com/foo/
  • https://example.com/foo/bar.php

And it allows URLs like:

  • https://example.com/
  • https://example.com/foo.php
  • https://example.com/bar.html
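
To see why the rules sort URLs into these two groups, here is a minimal sketch (not part of the answer, just illustrative) of how wildcard-aware crawlers such as Googlebot typically match rules: each Allow/Disallow value becomes a pattern anchored at the start of the path, * matches any run of characters, and the longest matching rule wins. The rule list mirrors the robots.txt above; the helper names rule_to_regex and allowed are made up for this sketch.

# Sketch of Google-style wildcard matching for the ruleset above (illustrative only).
import re

RULES = [
    ("disallow", "/exceptions.php"),
    ("disallow", "/*/"),
]

def rule_to_regex(pattern):
    # '*' matches any run of characters; a trailing '$' would anchor the end of the path.
    regex = re.escape(pattern).replace(r"\*", ".*")
    if regex.endswith(r"\$"):
        regex = regex[:-2] + "$"
    return re.compile("^" + regex)

def allowed(path):
    # Longest-match precedence: the most specific matching rule decides.
    # With no matching rule at all, the path is allowed by default.
    best_len, best_kind = -1, "allow"
    for kind, pattern in RULES:
        if rule_to_regex(pattern).match(path) and len(pattern) > best_len:
            best_len, best_kind = len(pattern), kind
    return best_kind == "allow"

for path in ["/", "/foo.php", "/bar.html", "/exceptions.php", "/foo/", "/foo/bar.php"]:
    print(path, "->", "allowed" if allowed(path) else "disallowed")

Running this prints "allowed" for /, /foo.php and /bar.html, and "disallowed" for the other three paths, matching the two lists above.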

For crawlers that don’t interpret * in Disallow values as a wildcard, you would have to list every first-level subfolder explicitly:

User-agent: *
Disallow: /exceptions.php
Disallow: /foo/
Disallow: /bar/
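
As a quick sanity check for this non-wildcard variant, Python’s standard urllib.robotparser uses the original prefix-only matching (no wildcards), so it behaves like such crawlers. The example.com URLs and the /foo/ and /bar/ folders are placeholders, just as in the rules above.

# Check the prefix-only ruleset with the standard library's robots.txt parser.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse("""User-agent: *
Disallow: /exceptions.php
Disallow: /foo/
Disallow: /bar/
""".splitlines())

for url in ["https://example.com/",
            "https://example.com/foo.php",
            "https://example.com/exceptions.php",
            "https://example.com/foo/bar.php"]:
    print(url, "->", rp.can_fetch("*", url))

This reports True for the first two URLs and False for the last two, i.e. only the root-level PHP files other than exceptions.php remain crawlable.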

Upvotes: 1
