Reputation: 919
I want to stop crawlers from crawling the subdomain tools.subdomain.com.
I found a snippet on the internet which shows the following rewrite rule:
RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|Baiduspider) [NC]
RewriteRule .* - [R=403,L]
How can I block those crawlers on this subdomain, or only allow current, up-to-date browsers to visit it? I want to manage this through .htaccess, because not every crawler respects robots.txt. For the robots.txt I have the following rewrite condition:
RewriteCond %{HTTP_HOST} =testing.subdomain.com
RewriteRule ^robots\.txt$ /robots_testing.txt [L]
Cheers
Sven
Upvotes: 0
Views: 169
Reputation: 10263
It depends on your server layout.
Segregated subdomain
If the subdomain has its own document root, it's enough to place an .htaccess file in the subdomain's document root containing the directives you specified:
RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|Baiduspider) [NC]
RewriteRule .* - [R=403,L]
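As a side note, mod_rewrite also provides the [F] (forbidden) flag, which returns 403 Forbidden and implies [L], so the following variant is equivalent to the rule above:

# Equivalent variant: [F] sends a 403 Forbidden response and implies [L]
RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|Baiduspider) [NC]
RewriteRule .* - [F]

Which form you use is a matter of taste; [F] is simply the more common idiom for blocking requests.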
Shared subdomain
If the subdomain shares the same document root as the top-level domain, it's enough to add a RewriteCond on the host name to the above:
RewriteCond %{HTTP_HOST} ^tools\.subdomain\.com$
RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|Baiduspider) [NC]
RewriteRule .* - [R=403,L]
Please note (1): the syntax ^tools\.subdomain\.com$ is needed to match the entire host name exactly; besides, since it's a regular expression, the dots must be escaped with a backslash.
Please note (2): the syntax of the last RewriteCond may vary according to the bots you want to exclude.
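For instance, you could broaden the alternation like this (the extra bot names below are purely illustrative, not a complete or authoritative list; adjust them to the crawlers you actually want to block):

# Illustrative only: extend the alternation with additional crawler names.
# YandexBot and DuckDuckBot are examples; substitute the user agents you need.
RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|Baiduspider|YandexBot|DuckDuckBot) [NC]
RewriteRule .* - [R=403,L]

Since the pattern is matched case-insensitively thanks to the [NC] flag, the capitalization of the bot names does not matter.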
Upvotes: 3