Reputation: 2169
I am running a website on IIS with more than 1000 paginated page links, and I want to prevent others from crawling/stealing these pages by running a crawler script that fetches the info page by page.
Is there any way to tell whether a request comes from a real user or is being run by a script? Or maybe some filter for this at the highest level, before the request even reaches the application?
Upvotes: 0
Views: 1573
Reputation: 100527
You can't prevent automated crawling.
You can make your content harder to crawl automatically, but if you allow users to see the content, it can be automated (automating browser navigation is not hard, and computers generally don't mind waiting a long time between requests).
One option is to require each single "user" (authenticated or not) to observe some minimal delay between requests (e.g. 1-5 seconds). This makes generic crawling useless (require some "user id" in each request and enforce the delay between requests), so anyone scraping the site would have to write custom crawling code, which is clearly more time-intensive.
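As an illustration, here is a minimal sketch of that per-user throttling idea, assuming each request carries some client identifier such as a session cookie or user id (the identifier and the 2-second delay are assumptions, not values from this answer). On an IIS site the same logic would typically live in an HTTP module or middleware:

```python
import time

MIN_DELAY_SECONDS = 2.0  # assumed minimum gap between requests per client

# In-memory map from client identifier to the time of its last request.
# A real deployment would use a shared store so it survives restarts
# and works across worker processes.
_last_request: dict[str, float] = {}

def is_allowed(client_id: str) -> bool:
    """Return True if this client waited long enough since its last request."""
    now = time.monotonic()
    last = _last_request.get(client_id)
    # Refresh the timestamp even on rejected requests, so a client that
    # keeps hammering the server never gets through until it backs off.
    _last_request[client_id] = now
    return last is None or (now - last) >= MIN_DELAY_SECONDS

if __name__ == "__main__":
    print(is_allowed("session-abc"))  # True  (first request)
    print(is_allowed("session-abc"))  # False (too soon after the first)
```

A request that fails the check would get an error response (e.g. HTTP 429) instead of the page content, which slows any single crawler down to the enforced rate.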
Note that writing a special "crawler" for your site may be seen as a "noble" challenge, which can significantly increase the incentive to create one (see, for example, the many "how to make Google Maps available offline" questions).
Upvotes: 1