Reputation: 2135
I am building a bot-trap / crawler-trap for my website:
There is a hidden link somewhere on the page which normal users do not see, but a robot does. The link is also listed in robots.txt, so Google will not fall into the trap.
When a bot opens the hidden page, its IP automatically gets red-flagged in MySQL.
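For illustration, a minimal sketch of what such a trap page could look like in PHP; the PDO connection details and the bot_trap table (with a UNIQUE key on ip) are assumptions, not something specified above:

```php
<?php
// trap.php - the hidden page behind the invisible link, disallowed in robots.txt.
// Connection details and the bot_trap table (UNIQUE key on ip) are assumed.
$pdo = new PDO('mysql:host=localhost;dbname=example;charset=utf8mb4', 'user', 'secret');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Red-flag the visiting IP; INSERT IGNORE keeps repeat hits from failing on the unique key.
$stmt = $pdo->prepare('INSERT IGNORE INTO bot_trap (ip, hit_at) VALUES (:ip, NOW())');
$stmt->execute([':ip' => $_SERVER['REMOTE_ADDR']]);

http_response_code(403);
exit;
```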
My question is: which approach is better?

1. Have my PHP script regenerate the .htaccess file with the new IP added to it, so the webserver itself does the blocking.
2. Leave .htaccess alone and instead look the IP up in the MySQL table every time someone loads a page, then decide in PHP what to do with the user.

Upvotes: 1
Views: 1281
Reputation: 4996
Which way is better? That highly depends on what you're able to do. The rule of thumb is: avoid .htaccess files if you can - configure your server directly. Everything else is just a crutch which you can use, but you should rest assured that you only use it because you can't do it the right way. So do not care too much unless you strive for the best.
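As a sketch of what "configure your server directly" could look like here: keep the ban list in its own generated file that the main configuration includes, so PHP never touches .htaccess. The file names and paths below are assumptions.

```apache
# In the main server config or the vhost, not in .htaccess
# (banned-ips.conf is a hypothetical file regenerated by the trap script):
Include /etc/apache2/banned-ips.conf

# /etc/apache2/banned-ips.conf, as generated:
<Location "/">
    <RequireAll>
        Require all granted
        Require not ip 203.0.113.7
        Require not ip 198.51.100.23
    </RequireAll>
</Location>
```

Running apachectl configtest before a graceful reload then catches a malformed generated file before it can take the site down.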
Upvotes: 0
Reputation: 211540
Instead of messing around with the Apache httpd configuration, which would take down your web stack if your script ever writes it incorrectly, what about integrating with a system like fail2ban?
Blocking using a banning tool would be far more effective.
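A rough sketch of such an integration, assuming the trap script appends a timestamped line like "2024-05-01 12:00:00 bottrap hit from 203.0.113.7" to its own log file; the jail name, filter name and paths are all made up:

```ini
# /etc/fail2ban/filter.d/bottrap.conf
[Definition]
failregex = bottrap hit from <HOST>$

# /etc/fail2ban/jail.local
[bottrap]
enabled  = true
port     = http,https
filter   = bottrap
logpath  = /var/log/bottrap.log
maxretry = 1
bantime  = 86400
```

fail2ban then inserts the firewall rule itself, so a mistake in the trap script cannot corrupt the Apache configuration.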
Upvotes: 1
Reputation: 12826
I would definitely go with option 2, for the sole reason that I would be very uncomfortable having a .htaccess file rewritten at random intervals on a live website. It is a nagging feeling, like having a gun to my head all the time.
If it is DB-driven, the worst that can happen in a screw-up is that some blacklisted IP still gets access. With .htaccess, if there is a screw-up, not only does every user's experience get messed up, secure data can be compromised as well.
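For completeness, a minimal sketch of what that DB-driven check could look like on each page load, reusing the assumed bot_trap table from the trap sketch above:

```php
<?php
// blacklist_check.php - included at the top of every page; table name is assumed.
$pdo = new PDO('mysql:host=localhost;dbname=example;charset=utf8mb4', 'user', 'secret');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$stmt = $pdo->prepare('SELECT 1 FROM bot_trap WHERE ip = :ip LIMIT 1');
$stmt->execute([':ip' => $_SERVER['REMOTE_ADDR']]);

if ($stmt->fetchColumn()) {
    // Flagged IP: stop before rendering anything. If this lookup ever fails,
    // the worst case is that a listed IP is served the page anyway.
    http_response_code(403);
    exit('Forbidden');
}
// ...normal page rendering continues below.
```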
Upvotes: 2