nullpointerexception

Reputation: 735

Keep web crawlers out of your site

Is there any way in web development to ensure that web crawlers cannot crawl your website?

Upvotes: 2

Views: 751

Answers (4)

Matthew Lock

Reputation: 13476

You could also deny access based on the crawler's user agent. Of course, this assumes that the crawler sends a user agent different from that of a regular browser.
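As a rough illustration of the user-agent idea, here is a minimal Python sketch. The token list and the function name `is_blocked_user_agent` are made up for this example, and treating a missing header as suspicious is just an assumption; a crawler that copies a browser's User-Agent string will pass straight through.

```python
# Hypothetical substrings to look for; adjust to the crawlers you care about.
BLOCKED_UA_TOKENS = ("googlebot", "bingbot", "crawler", "spider")


def is_blocked_user_agent(user_agent):
    """Return True when the User-Agent header contains a blocked token."""
    if not user_agent:
        # Assumption: a request without any User-Agent is treated as a bot.
        return True
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_UA_TOKENS)
```

Your request handler would call this with the incoming header value and answer with something like 403 Forbidden when it returns True.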

Upvotes: 1

Kangkan

Reputation: 15571

Use robots.txt to tell robots which parts of your website they are allowed or disallowed to crawl.

Upvotes: 0

Quentin

Reputation: 943527

Ensure? No.

You can ask politely with robots.txt (but it can be ignored), you can put up barriers with a CAPTCHA (but these can be defeated, and they get in the way of ordinary users), and you can monitor each visitor's behaviour for bot patterns (but bots can cycle through proxies and rate-limit themselves).
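To make the behaviour-monitoring option a bit more concrete, here is a minimal sketch of a sliding-window request counter in Python. The class name `RequestMonitor`, the thresholds, and the use of an IP address as the client identifier are all assumptions for illustration. As noted above, a bot that rotates proxies or throttles itself will stay under any such limit.

```python
from collections import defaultdict, deque


class RequestMonitor:
    """Flags clients that exceed max_requests within window_seconds."""

    def __init__(self, max_requests=5, window_seconds=10.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> recent request times

    def looks_like_bot(self, client_id, now):
        """Record a request at time `now` (seconds) and report whether the
        client is over the limit for the current window."""
        hits = self._hits[client_id]
        hits.append(now)
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_requests
```

In a real handler you would pass something like `time.monotonic()` as `now` and key on whatever identifies a visitor for you (IP address, session cookie).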

Upvotes: 3

Darin Dimitrov

Reputation: 1038780

You could place a robots.txt file with the following contents at the root of your site, which will keep civilized robots from indexing it:

User-agent: *
Disallow: /

Note that this won't prevent uncivilized robots from indexing it, since they simply ignore the file. The only way to stop them is to use techniques such as CAPTCHA.
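For reference, this is roughly what a civilized robot does with those two lines before fetching a page. The sketch uses Python's standard `urllib.robotparser`; the bot name and URLs are placeholders. Nothing forces a crawler to run a check like this, which is why the file only works against the polite ones.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(user_agent, url):
    """Return True if the parsed rules allow `user_agent` to fetch `url`."""
    return parser.can_fetch(user_agent, url)
```

With the rules above, `may_crawl` reports that no path on the site may be fetched, whatever the user agent.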

Of course, while your site is under construction it is preferable to use a dedicated development machine that is not accessible from the internet.

Upvotes: 1
