Reputation: 11597
I've got an Ajax-rich website with extensive _escaped_fragment_ sections for Ajax indexing. All of my _escaped_fragment_ URLs 301-redirect to a special module that outputs the HTML snapshots the crawlers need (i.e. mysite.com/#!/content maps to mysite.com/?_escaped_fragment_=/content, which in turn 301s to mysite.com/raw/content). Even so, I'm somewhat afraid of users stumbling on those "raw" URLs themselves and making them appear in search engines.
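For reference, a minimal sketch of the redirect step described above; the /raw prefix and the parameter handling mirror the examples, but the real module's routing and validation are assumed:

```php
<?php
// Hypothetical front-controller snippet for the setup described above:
// a crawler's request for /?_escaped_fragment_=/content is 301-redirected
// to /raw/content, where the snapshot module renders plain HTML.
if (isset($_GET['_escaped_fragment_'])) {
    // Normalise the fragment; a real module would also validate it
    // against the site's known routes.
    $fragment = '/' . ltrim((string) $_GET['_escaped_fragment_'], '/');
    header('Location: /raw' . $fragment, true, 301);
    exit;
}
```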
In PHP, how do I make sure only robots can access this part of the website? (Much like Stack Overflow blocks normal users from its sitemap and only lets robots fetch it.)
Upvotes: 0
Views: 112
Reputation: 944076
You can't, at least not reliably.
robots.txt asks spiders to keep out of parts of a site, but there is no equivalent for regular user agents.
The closest you could come would be to keep a whitelist of acceptable IP addresses or user agents and serve different content based on that, but that risks false positives (a sketch of the idea follows).
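As an illustration of that whitelist approach: Google documents verifying Googlebot with a reverse-DNS lookup followed by a forward-confirming lookup, which a PHP gate might implement roughly like this. Other crawlers need their own rules, and the check can still misfire, which is exactly the risk mentioned above:

```php
<?php
// Sketch of crawler verification via reverse DNS plus forward confirmation.
// This only covers Googlebot; treat it as an example, not a complete gate.
function isVerifiedGooglebot(string $ip): bool
{
    $host = gethostbyaddr($ip);                  // reverse DNS lookup
    if ($host === false || $host === $ip) {
        return false;                            // no PTR record available
    }
    if (!preg_match('/\.(googlebot|google)\.com$/', $host)) {
        return false;                            // not a Google hostname
    }
    return gethostbyname($host) === $ip;         // forward-confirm the lookup
}

if (!isVerifiedGooglebot($_SERVER['REMOTE_ADDR'])) {
    header('Location: /', true, 302);            // send humans elsewhere
    exit;
}
```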
Personally, I'd stop catering for old IE, scrap the #! URIs and the _escaped_fragment_ hack, switch to using pushState and friends, and have the server build the initial view for any given page (sketched below).
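A minimal sketch of that approach, assuming a front controller and a hypothetical renderPage() helper standing in for whatever builds the snapshots today: the clean URL serves full HTML directly, and client-side script can enhance later navigation with history.pushState().

```php
<?php
// Sketch only: serve a fully rendered page at the clean URL, so neither
// #! URIs nor _escaped_fragment_ handling are needed. renderPage() is a
// hypothetical stand-in for the existing snapshot-rendering module.
function renderPage(string $path): string
{
    // The real site would reuse its snapshot module here.
    return '<!DOCTYPE html><html><body><h1>'
         . htmlspecialchars($path)
         . '</h1></body></html>';
}

$path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
echo renderPage($path); // full HTML up front; JS can take over with pushState
```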
Upvotes: 2