Johann du Toit

Reputation: 2667

Crawling for Eternity

I've recently been building a new web app dealing with Recurring Events. These events can recur on a daily, weekly or monthly basis.

This is all working great. But when I started creating the Event Browser page (which will be visible to the public internet), a thought crossed my mind.

If a crawler hits this page, with next and previous buttons to browse the dates, won't it just continue forever? So I opted out of using plain HTML links and used AJAX instead, which means bots won't be able to follow the links.

But this method means I'm losing that functionality for users without JavaScript. Or is the number of users without JavaScript too small to worry about?

Is there a better way to handle this?

I'm also very interested in how bots like the Google crawler detect black holes like these and what they do to handle them.

Upvotes: 4

Views: 181

Answers (2)

John Williams

Reputation: 113

Even a minimally functional web crawler requires a lot more sophistication than you might imagine, and the situation you describe is not a problem. Crawlers operate on some variant of breadth-first search, so even if they do nothing to detect black holes, it's not a big deal. Another typical feature of web crawlers that helps is that they avoid fetching a lot of pages from the same domain in a short time span, because otherwise they would inadvertently be performing a DoS attack against any site with less bandwidth than the crawler.
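
For illustration, here's a rough Python sketch of that breadth-first, per-domain-throttled behaviour. The page budget, delay value and regex-based link extraction are all stand-ins, not how a production crawler is actually built:

    import re
    import time
    import urllib.request
    from collections import deque
    from urllib.parse import urljoin, urlparse

    def fetch_links(url):
        """Toy link extractor: fetch the page and pull out href values."""
        try:
            html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "ignore")
        except Exception:
            return []
        return [urljoin(url, href) for href in re.findall(r'href="([^"#]+)"', html)]

    def crawl(seed, max_pages=100, per_domain_delay=10.0):
        queue = deque([seed])    # FIFO queue gives the breadth-first order
        seen = {seed}
        last_fetch = {}          # domain -> timestamp of the last request
        fetched = 0

        while queue and fetched < max_pages:
            url = queue.popleft()
            domain = urlparse(url).netloc

            # Politeness: if this domain was hit too recently, push the URL
            # back onto the queue and wait a little, so no single site gets
            # flooded with requests.
            if time.time() - last_fetch.get(domain, 0.0) < per_domain_delay:
                queue.append(url)
                time.sleep(0.1)
                continue

            last_fetch[domain] = time.time()
            fetched += 1

            for link in fetch_links(url):
                if link not in seen:
                    seen.add(link)
                    queue.append(link)
        return seen

Because the frontier is breadth-first, an endless next/previous chain only contributes one new URL per level, so it can never crowd out the rest of the queue.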

Even though it's not strictly necessary for a crawler to detect black holes, a good one might have all sorts of heuristics to avoid wasting time on low-value pages. For instance, it may choose to ignore pages that don't have a minimum amount of English (or whatever language) text, pages that contain nothing but links, pages that seem to contain binary data, etc. The heuristics don't have to be perfect because the basic breadth-first nature of the search ensures that no single site can waste too much of the crawler's time, and the sheer size of the web means that even if it misses some "good" pages, there are always plenty of other good pages to be found. (Of course this is from the perspective of the web crawler; if you own the pages being skipped, it might be more of a problem for you, but companies like Google that run web crawlers are intentionally secretive about the exact details of things like that because they don't want people trying to outguess their heuristics.)
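
As a toy illustration of what heuristics like these might look like (the thresholds here are invented, not anything any search engine publishes):

    import re

    def looks_low_value(html):
        """Toy versions of the heuristics above; the thresholds are made up."""
        if "\x00" in html:                        # smells like binary data, not HTML
            return True
        text = re.sub(r"<[^>]+>", " ", html)      # crude tag stripping
        words = re.findall(r"[A-Za-z]{2,}", text)
        links = len(re.findall(r"<a\s", html, re.IGNORECASE))
        if len(words) < 50:                       # almost no running text
            return True
        if links and len(words) / links < 5:      # nothing much but links
            return True
        return False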

Upvotes: 2

tripleee

Reputation: 189387

Add a nofollow directive to the page, or rel="nofollow" to the individual links you don't want crawled. You can do this in the page source (a robots meta tag or a link attribute), or block the URLs outright in robots.txt. See the Robots Exclusion Standard.
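
For example (the /events/browse/ path and link URL below are just placeholders for whatever your event browser uses):

    # robots.txt: keep compliant crawlers out of the date-browsing URLs
    User-agent: *
    Disallow: /events/browse/

    <!-- page source: a robots meta tag covering the whole page -->
    <meta name="robots" content="noindex, nofollow">

    <!-- or mark only the pagination links -->
    <a href="/events/browse/2012-06" rel="nofollow">Next month</a>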

You may still need to think about how to fend off ill-behaved bots which do not respect the standard.

Upvotes: 4
