Reputation: 735
Is there any way in web development to ensure that web crawlers cannot crawl your website?
Upvotes: 2
Views: 751
Reputation: 13476
You could also deny access based on the crawler's user agent. Of course, this assumes that the crawler sends a user agent different from that of a regular browser.
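As a rough illustration of the user-agent approach, here is a minimal Python sketch written as WSGI middleware. The blocklist pattern and the names `is_probable_crawler` and `block_crawlers` are assumptions made up for this example, not part of any library, and a crawler that sends a browser-like user agent will pass straight through.

```python
import re

# Hypothetical blocklist: substrings that often appear in crawler User-Agent headers.
BOT_PATTERN = re.compile(r"bot|crawler|spider|scrapy|curl|wget", re.IGNORECASE)


def is_probable_crawler(user_agent):
    """Return True if the User-Agent header is empty or matches the blocklist."""
    if not user_agent:
        return True
    return BOT_PATTERN.search(user_agent) is not None


def block_crawlers(app):
    """Wrap a WSGI app so that flagged requests receive 403 Forbidden."""
    def middleware(environ, start_response):
        if is_probable_crawler(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware
```

Wrapping an existing WSGI application as `application = block_crawlers(application)` would apply the check to every request. Treating an empty user agent as a crawler is a design choice of this sketch and may be too strict for some sites.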
Upvotes: 1
Reputation: 15571
Use robots.txt to tell robots which parts of your website they are allowed or disallowed to index.
Upvotes: 0
Reputation: 943527
Ensure? No.
You can ask politely with robots.txt (but it can be ignored), you can put up barriers with CAPTCHA (but they can be defeated, and they also get in the way of ordinary users), and you can monitor the behaviour of each visitor looking for bot patterns (but bots can cycle through proxies and rate-limit themselves).
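To make the monitoring idea concrete, here is a minimal Python sketch of one possible bot pattern check: counting requests per client in a sliding time window. The class name `RequestMonitor`, the thresholds, and the use of an IP address as the client id are all assumptions for illustration. As noted above, a bot that rotates proxies or slows itself down will not trip it.

```python
import time
from collections import defaultdict, deque


class RequestMonitor:
    """Flags clients that make more than max_requests within window_seconds."""

    def __init__(self, max_requests=20, window_seconds=10.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # client id (for example an IP address) -> timestamps of recent requests
        self._hits = defaultdict(deque)

    def looks_like_bot(self, client_id, now=None):
        """Record one request and return True if the client is over the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        hits.append(now)
        # Discard timestamps that have fallen out of the window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_requests
```

A request handler could call `monitor.looks_like_bot(client_ip)` on each request and respond with a CAPTCHA or an error once it returns True.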
Upvotes: 3
Reputation: 1038780
You could place a robots.txt file with the following contents at the root of your site which will prevent the civilized robots from indexing it:
User-agent: *
Disallow: /
Note that this won't stop uncivilized robots from indexing it, since they simply ignore the file. To keep those out you need techniques such as CAPTCHA.
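To see how a civilized robot would interpret those two lines, here is a small Python sketch using the standard library's `urllib.robotparser`. The helper name `may_fetch`, the user agent `ExampleBot`, and the example.com URL are placeholders invented for this example.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def may_fetch(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(may_fetch("ExampleBot", "https://example.com/index.html"))
```

A well-behaved crawler runs a check like this before each request and skips the URL when the answer is False. Nothing on the server enforces it, which is why this only works against robots that choose to cooperate.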
Of course, while your site is under construction it is preferable to use a dedicated development machine that is not accessible from the internet.
Upvotes: 1