Reputation: 305
This bot doesn't respect the Disallow rules in my robots.txt.
I have this in robots.txt:
User-agent: Msnbot
Disallow: /
User-Agent: Msnbot/2.0b
Disallow: /
Until now it was pretty slow, but now it is a monster that won't leave my site at all. It crawls all of my WordPress and MyBB pages 24/7.
Should I block IP ranges, or what else can I do to stop all of these content stealers?
Upvotes: 2
Views: 4747
Reputation: 546
Though I was unable to identify specific bots that visit my site and spend 0:00 time per page, I was able to identify the countries where these attacks are coming from.
Since the attacks are coming mostly from China and the US, I'm going to block those countries completely from visiting my website using my .htaccess file. I hope it works.
I only recommend this if you know you want traffic from your own country only, and you're sure you won't lose traffic you actually want from the countries you ban.
Here is the link to the tutorial:
https://www.countryipblocks.net/acl.php
I just implemented this, and I hope it works for me. It seems like a good solution in my case because my Canadian traffic is good, while the US and China traffic all seems to be attacks.
Again, I recommend discretion when using a solution like this.
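To make the idea concrete: a country block list is just a set of IP ranges, and each visitor address is checked against them. Here is a minimal Python sketch of that check. The two ranges are reserved documentation ranges that I picked as placeholders, not real country allocations, and `is_blocked` is a hypothetical helper name.

```python
import ipaddress

# Placeholder CIDR ranges (reserved documentation blocks), NOT real
# country allocations. A real list would come from a country IP database.
BLOCKED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24")
]


def is_blocked(address):
    """Return True if the address falls inside any blocked range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in BLOCKED_RANGES)


if __name__ == "__main__":
    for candidate in ("192.0.2.17", "203.0.113.5"):
        print(candidate, "blocked" if is_blocked(candidate) else "allowed")
```

Running a few addresses from your own access log through a check like this before deploying the list is one way to see whether you would lose visitors you want to keep.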
Upvotes: 0
Reputation: 395
Here's what you need to do instead:
Code:
User-agent: *
Disallow:
User-agent: MSNbot
Disallow: /
The above code allows all robots except MSNbot.
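If you want to check how a parser reads those rules, here is a small sketch using Python's standard `urllib.robotparser`. It only shows how that one parser interprets the file; real crawlers may behave differently, and robots.txt is advisory, so a bot that ignores it will not be stopped by it. The blank line between the two groups and the `is_allowed` helper are my additions for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules from this post, with a blank line separating the two groups.
RULES = """\
User-agent: *
Disallow:

User-agent: MSNbot
Disallow: /
"""


def is_allowed(user_agent, path="/"):
    """Ask Python's robots.txt parser whether user_agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for agent in ("msnbot/2.0b", "Googlebot"):
        print(agent, "allowed" if is_allowed(agent) else "disallowed")
```

With these rules the parser should report msnbot as disallowed and any other agent as allowed.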
You can read more about the robots exclusion protocol here.
For example, for Bing:
User-agent: MSNBot
Disallow: /
For Google:
User-agent: googlebot
Disallow: /
If you want to block all bots, use this:
User-agent: *
Disallow: /
Upvotes: 0
Reputation: 6120
Based on "Block by useragent or empty referer", you could do something like this in your .htaccess:
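Options +FollowSymLinks
RewriteEngine On
RewriteBase /
# Flag requests whose User-Agent starts with "msnbot" (case-insensitive)
SetEnvIfNoCase User-Agent "^Msnbot" ban_agent
# Refuse flagged requests (Apache 2.2-style access control)
Deny from env=ban_agent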
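Note that the `^` anchors the pattern to the start of the User-Agent string. To see which strings that catches, here is a rough Python sketch of the same case-insensitive match. Python's `re` is not Apache's regex engine, but for a pattern this simple I would expect them to agree. The sample User-Agent strings are illustrative assumptions; check your own access log for the real ones.

```python
import re

# Same pattern as in the SetEnvIfNoCase line, matched case-insensitively.
BAN_PATTERN = re.compile(r"^Msnbot", re.IGNORECASE)


def is_banned(user_agent):
    """Return True if the User-Agent would be flagged as ban_agent."""
    return BAN_PATTERN.search(user_agent) is not None


if __name__ == "__main__":
    # Illustrative samples only, not taken from the original thread.
    samples = [
        "msnbot/2.0b (+http://search.msn.com/msnbot.htm)",
        "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
    ]
    for ua in samples:
        print("banned " if is_banned(ua) else "allowed", ua)
```

A crawler whose User-Agent begins with "Mozilla/5.0" and only mentions the bot name later in the string would slip past this rule, so drop the `^` if you need to match the name anywhere.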
Upvotes: 3