Reputation: 1437
In theory I have two subdomains set up in my hosting:
subdomain1.mydomain.com
subdomain2.mydomain.com
subdomain2 has a CNAME record pointing to an external service.
mydomain.com has a robots.txt that allows indexing everything.
subdomain2.mydomain.com has a robots.txt that allows indexing nothing due to the CNAME record.
If I set up a forward from subdomain1.mydomain.com to subdomain2.mydomain.com, which robots.txt would be used when a link to subdomain1.mydomain.com is accessed? Does a domain forward work the same way as a CNAME record when it comes to robots.txt?
Upvotes: 0
Views: 510
Reputation:
The difficulty is that you're looking at this from the standpoint of the software you're configuring, but search engines and other robots only see the document they get back when they load a URL, just as any user with a web browser would. That is, search engines will try to load http://subdomain1.mydomain.com/robots.txt and http://subdomain2.mydomain.com/robots.txt, and it's up to you (by configuring whatever software your server is running) to ensure that each of those URLs serves what you want.
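To make the per-host lookup concrete, here is a minimal sketch of how a crawler might derive the robots.txt URL from any page URL. The function name `robots_url` is made up for illustration, and real crawlers may differ in details:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL a crawler would consult for page_url."""
    parts = urlsplit(page_url)
    # Only the scheme and host (plus port, if any) are kept;
    # the path, query string and fragment of the page are dropped.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://subdomain1.mydomain.com/some/page?x=1"))
print(robots_url("http://subdomain2.mydomain.com/other"))
```

Each hostname gets its own robots.txt URL, so the two subdomains are checked independently unless the server redirects one to the other.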
A CNAME is just a DNS alias: it says that one hostname is another name for a second hostname, whose address is then looked up. A robot follows it when resolving the name to find the "real" IP to connect to, but it has no further bearing on what the GET /robots.txt request does once the robot connects to the server. The request still names the original hostname in its Host header, and the server at that IP decides what to return for it.
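As a rough illustration of that split between name resolution and the HTTP request, the sketch below builds the request a client would send. The helper names `build_robots_request` and `resolve` are assumptions for this example, and `resolve` needs network access, so it is defined but not called here:

```python
import socket
from typing import List


def build_robots_request(hostname: str) -> bytes:
    """Build the HTTP/1.1 request a client sends for hostname's robots.txt."""
    # The Host header carries the name from the URL, not the CNAME target.
    return (
        "GET /robots.txt HTTP/1.1\r\n"
        f"Host: {hostname}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def resolve(hostname: str) -> List[str]:
    """Resolve hostname to IPv4 addresses; the resolver follows any CNAME."""
    _canonical, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return addresses


print(build_robots_request("subdomain2.mydomain.com").decode("ascii"))
```

Whatever address the name resolves to, the request text is the same, which is why the CNAME by itself does not decide which robots.txt is served.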
As for "forwarding", that term can mean different things, so you need to know what a browser or robot actually receives when it requests the page. If the server answers with a 301 or 302 redirect that sends the client to another URL, different search engines may honor it differently, particularly when the redirect leads to an entirely different domain. I would probably avoid it, simply because a lot of robots are poorly written. Some search engines provide tools that show how their crawlers read your robots.txt URLs, such as Google's tool.
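If you want a quick local check of what a given robots.txt body permits, Python's standard `urllib.robotparser` can parse it. This is only a sketch: the two file bodies and the `ExampleBot` user agent are assumptions standing in for the "allow everything" and "allow nothing" files from the question, and a real search engine's parser may interpret edge cases differently:

```python
from urllib.robotparser import RobotFileParser

# Assumed file contents, not taken from the actual sites.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def parse_robots(text: str) -> RobotFileParser:
    """Parse a robots.txt body that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


allow = parse_robots(ALLOW_ALL)
block = parse_robots(BLOCK_ALL)

print(allow.can_fetch("ExampleBot", "http://mydomain.com/page"))
print(block.can_fetch("ExampleBot", "http://subdomain2.mydomain.com/page"))
```

The first call reports that fetching is allowed and the second that it is not, so whichever body the crawler ends up receiving for a hostname determines the outcome.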
Upvotes: 1
Reputation: 438
This depends on your server setup.
Take the following config, for example:
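server {
    listen 80;
    server_name subdomainA.example.com;
    # Temporarily redirect every request, including /robots.txt,
    # to the same path and query string on subdomainB.
    return 302 http://subdomainB.example.com$request_uri;
}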
In this case, we're redirecting everything from subdomainA.example.com to subdomainB.example.com. This includes your robots.txt file.
However, if your configuration only redirects certain paths, your robots.txt file will be redirected only if it matches one of them. It would not be redirected if you were redirecting only, say, /someFolder.
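The difference between the two cases can be modelled with a small sketch. This is a simplified stand-in for the server's decision, not nginx's actual location matching, and `redirect_location` and its parameters are hypothetical names:

```python
from typing import Optional, Sequence


def redirect_location(
    path: str,
    redirected_prefixes: Sequence[str],
    target_origin: str = "http://subdomainB.example.com",
) -> Optional[str]:
    """Return the redirect target for path, or None if it is served locally."""
    for prefix in redirected_prefixes:
        # Match the prefix itself or anything below it.
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            return target_origin + path
    return None


# Redirecting everything also redirects robots.txt.
print(redirect_location("/robots.txt", ["/"]))
# Redirecting only /someFolder leaves robots.txt on the original host.
print(redirect_location("/robots.txt", ["/someFolder"]))
```

With the catch-all prefix, robots.txt is sent to subdomainB; with only /someFolder redirected, the function returns None and the original host answers the request itself.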
Note that if you don't return a 302 but just set a root for each server block (e.g. subdomainA and subdomainB are different subdomains but serve the same content), the robots.txt content is determined by whatever file is in that root directory.
So, if I'm understanding your config correctly, subdomain1 will use the robots.txt from subdomain2.
Upvotes: 1