Reputation: 237
I am working on analytics and I am getting many inaccurate results, mostly because of social media bots and other random bots like BufferBot, DataMinr, etc. from Twitter.
Is there any web API/database of all known bots that I can use to check whether a visitor is a bot or a human?
Or is there a good way to block such bots so that they don't skew the analytics stats?
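For the first part, one common approach is to match the request's User-Agent header against a list of known bot signatures. A minimal sketch in Python (the signature list below is just an example, not a complete database):

```python
# Hedged sketch: flag a hit as a probable bot if its User-Agent contains a
# known bot signature. The list here is illustrative only; a real deployment
# would load a maintained list such as the robotstxt.org database.
KNOWN_BOT_SIGNATURES = ["bot", "crawler", "spider", "bufferbot", "dataminr"]

def is_probable_bot(user_agent: str) -> bool:
    """Return True if the User-Agent string matches a known bot signature."""
    ua = user_agent.lower()
    return any(sig in ua for sig in KNOWN_BOT_SIGNATURES)
```

Note that this only catches bots that identify themselves honestly; bots that spoof a browser User-Agent will slip through, which is why the honeypot approach below is often combined with it.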
Upvotes: 0
Views: 1653
Reputation: 400
There is no way to outright block ALL bots; it would take an insane amount of time. You could use an .htaccess file or a robots.txt. Stopping Google from indexing the site is easy, but blocking bot traffic in general can get complicated and act like a house of cards. I suggest using this list of crawlers/web-bots: http://www.robotstxt.org/db.html
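As a sketch of the .htaccess route, you can deny requests whose User-Agent matches known crawler names. The bot names below are examples, and the syntax shown is for Apache 2.2-style access control:

```apache
# Example .htaccess fragment: deny requests from selected bots by User-Agent.
# Bot names here are illustrative; extend the pattern from a maintained list.
BrowserMatchNoCase "AhrefsBot|SemrushBot|MJ12bot" bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot
```

On Apache 2.4 the equivalent would use `Require` directives instead of `Order`/`Allow`/`Deny`.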
Upvotes: 0
Reputation: 219924
You can link to a hidden page that is blocked by robots.txt. When visited, the page captures the user-agent and IP address of the bot and then appends one or both of them to an .htaccess file, which blocks them permanently. It only catches bad bots and is automated, so you don't have to do anything to maintain it.
Just make sure you set up the robots.txt file first and then give the good bots a fair chance to read it and update their crawling accordingly.
Upvotes: 1
Reputation: 1356
Create a file called robots.txt
in your site root and add the following lines:
User-agent: *
Disallow: /
Upvotes: 0