Vignesh T.V.

Reputation: 1860

IP Blacklisting Apache

An IP address was scanning my website, which showed up in my Apache error logs, so I opened a question about it here: Apache Error Log spammed with an error

Now, after blocking that address, a new IP is requesting the same directories as the previous one did. The directories do not exist, and the requested paths look randomly generated.

THE PROBLEM:

The new IP is 66.249.74.73, and the IP info at http://www.infobyip.com/ip-66.249.74.73.html says it belongs to Googlebot. Now I am confused. Should I block it or not?

And if I block these IPs and a new one starts doing the same thing, do I have to keep blocking IPs one by one? Isn't there a permanent solution?

I am using Apache on Ubuntu 15.10.

UPDATE: It is now able to reach my website and crawl it. I have not submitted the site to any index; I am still building it.

 [Fri Nov 20 18:36:31.026761 2015] [core:info] [pid 19594] [client 66.249.74.73:57119] AH00128: File does not exist: /var/www/html/robots.txt
 [Fri Nov 20 18:36:31.446036 2015] [core:info] [pid 19595] [client 66.249.74.69:63983] AH00128: File does not exist: /var/www/html/company/v/19175398/\xce\xe4\xba\xba\xb5\xc2\xc0\xfb\xd4\xb4\xc3\xb3\xd2\xd7\xd3\xd0\xcf\xde\xb9\xab\xcb\xbe
 [Fri Nov 20 18:36:32.228918 2015] [core:info] [pid 19595] [client 66.249.74.69:63983] AH00128: File does not exist: /var/www/html/company/v/5146022/\xd5\xf2\xbd\xad\xca\xd0\xb5\xa4\xcd\xbd\xc7\xf8\xb9\xe2\xc3\xf7\xb8\xa8\xd6\xfa\xb2\xc4\xc1\xcf\xb3\xa7

I already opened a question on SO, but new IPs keep appearing and I don't know how to block them all. (Is adding each one manually to the blacklist the only way?)

Upvotes: 1

Views: 358

Answers (2)

Mazaka

Reputation: 634

If robots.txt isn't working, you can also try mod_rewrite in a .htaccess file:

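RewriteEngine on
# Match requests whose User-Agent contains any of these crawler names
RewriteCond %{HTTP_USER_AGENT} AltaVista [OR]
RewriteCond %{HTTP_USER_AGENT} Googlebot [OR]
RewriteCond %{HTTP_USER_AGENT} msnbot [OR]
RewriteCond %{HTTP_USER_AGENT} Slurp
# Answer matching requests with 403 Forbidden; redirecting them back to
# this same domain would send the crawler into a redirect loop
RewriteRule ^ - [F]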
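
Before deploying conditions like these, it can help to check offline which User-Agent strings they would catch. The following is only a rough Python sketch of the matching logic, not how Apache itself evaluates the rules: it assumes each pattern is an unanchored, case-sensitive regex (the mod_rewrite default when [NC] is not given), and the function name and the two sample User-Agent strings are made up for illustration.

```python
import re

# Patterns copied from the RewriteCond lines above. Each one is treated as an
# unanchored, case-sensitive regex, as mod_rewrite does when [NC] is absent.
BOT_PATTERNS = ["AltaVista", "Googlebot", "msnbot", "Slurp"]


def matches_bot_rule(user_agent: str) -> bool:
    """Return True if any pattern matches, like RewriteCond lines joined by [OR]."""
    return any(re.search(pattern, user_agent) for pattern in BOT_PATTERNS)


# Illustrative User-Agent strings (assumptions, not taken from the logs above).
googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser_ua = "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:42.0) Gecko/20100101 Firefox/42.0"

print(matches_bot_rule(googlebot_ua))
print(matches_bot_rule(browser_ua))
```

Keep in mind that the User-Agent header is supplied by the client, so a rule like this only catches crawlers that identify themselves honestly.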

Upvotes: 1

hjpotter92

Reputation: 80629

Well-behaved crawlers, Googlebot included, request /robots.txt before crawling a site. Create this file with the following content:

User-agent: *
Disallow: /

and compliant bots will stop crawling your site.

You can read more about robots.txt here.
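
If you want to confirm what those two lines mean to a crawler before uploading the file, Python's standard `urllib.robotparser` can evaluate them locally. This is a minimal sketch: the `is_allowed` helper is a name made up here, the URL is a placeholder on a hypothetical `yourdomain.com`, and a real crawler's own parser may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
# parse() takes the file content as a list of lines; no network access needed.
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(user_agent: str, url: str) -> bool:
    """Hypothetical helper: may this user agent fetch this URL under the rules?"""
    return parser.can_fetch(user_agent, url)


# Placeholder URL on a hypothetical domain, modelled on the paths in the log.
test_url = "http://yourdomain.com/company/v/19175398/"

print(is_allowed("Googlebot", test_url))
print(is_allowed("msnbot", test_url))
```

Both checks should come back as not allowed, since `User-agent: *` applies the `Disallow: /` rule to every crawler that honours the file.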

Upvotes: 1
