Graham

Reputation: 309

How to block a website from crawling my site without knowing their IP address

There's a spam site that is an exact replica of my site. They continuously crawl my site and update or add content within 20 minutes of my changes (across all 30k+ URLs). After some research, I'm positive that they're crawling my site and storing the content on their server.

They use Cloudflare, so I can't find out their true IP address. Can I somehow block them from crawling my site (via .htaccess or something similar) just by knowing the domain name?

Upvotes: 0

Views: 1118

Answers (1)

IMSoP

Reputation: 97638

It's entirely possible that the server they run their crawling script from is completely separate from the server they host their clone on, even if they weren't using Cloudflare.

However, if they're crawling all that content, it should be pretty obvious in your server's access logs. If you don't know where those are, talk to your hosting provider. Then look for the most common IP addresses listed, and try blocking them with something like this:

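# Apache 2.2 syntax (on Apache 2.4 this requires mod_access_compat).
# Replace x.x.x.x with the crawler's IP address taken from your access logs.
Order Allow,Deny
Allow from All
Deny from x.x.x.x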
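
To find the most common IP addresses in the access log, something like the following Python might help. It is only a minimal sketch: it assumes each log line begins with the client IP (as in Apache's common and combined log formats), and the function name `top_client_ips` is made up for illustration.

```python
from collections import Counter


def top_client_ips(log_lines, n=10):
    """Return the n most frequent client IPs as (ip, request_count) pairs.

    Assumption: the client IP is the first whitespace-separated field
    of every log line. Adjust the index if your log format differs.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(n)


# Hypothetical usage (the log path depends on your hosting setup):
#
#     with open("/path/to/access.log", errors="replace") as log:
#         for ip, hits in top_client_ips(log):
#             print(hits, ip)
```

An address that accounts for far more requests than any normal visitor is a candidate for the `Deny from` line. Check that it does not belong to a search engine crawler you want to keep before blocking it.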

Upvotes: 2
