Reputation: 1283
I run a secure link redirection service (expiringlinks.co). If I send a redirect header from PHP, Facebook is able to show a preview of the website I'm redirecting to when users send links to one another via Facebook. I want to avoid this. Right now I'm using an AJAX call to get the URL and JavaScript to redirect, but that causes problems for users who don't have JavaScript enabled.
Here are several ways I've considered to block Facebook, none of which I've managed to get working:
I've tried blocking the Facebook bot (facebookexternalhit/1.0 and facebookexternalhit/1.1), but it isn't working; I don't think they use those user agents for this functionality.
I'm thinking of blocking Facebook's IP addresses, but I can't find all of them, and I don't think it will work unless I get all of them.
I've thought of using a CAPTCHA or even a button, but I can't bring myself to do that to my visitors. Not to mention I don't think anyone would use the site.
I've searched the Facebook docs for meta tags that would "opt me out", but haven't found one, and I doubt I would trust it if I had.
Any creative ideas or any idea how to implement the ones above? Thank you so much in advance!
Upvotes: 9
Views: 9318
Reputation: 1
This can be done in nginx using the geoip2 module.
# This block goes in the http { } section of the config, for example in
# /etc/nginx/conf.d/geoip.conf
geoip2 /usr/share/GeoIP/country_asn.mmdb {
    # If you have a database update script, you can configure automatic reload:
    # auto_reload 1h;
    $geoip2_asn asn;
    $geoip2_as_name as_name;
    $geoip2_continent continent;
    $geoip2_continent_name continent_name;
    $geoip2_country country;
}
Then use it in a location block:
# Put this inside the location block.
# AS32934 is Facebook's autonomous system number.
if ($geoip2_asn = "AS32934") {
    return 402;
}
Upvotes: 0
Reputation: 3772
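Try this; it works for me:
<?php
// Fall back to an empty string when the client sends no User-Agent header.
$ua = $_SERVER['HTTP_USER_AGENT'] ?? '';
// Case-insensitive match on the Facebook crawler's user agent.
if (preg_match('/facebookexternalhit/i', $ua)) {
    header('Location: no_fb_page.php');
    exit;
}
?>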
Upvotes: 2
Reputation: 422
You could try using a meta refresh instead of a JavaScript redirect. Meta refresh works in all browsers, and because the page itself still returns a 200 response, a crawler should stop resolving there instead of following the redirect.
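A minimal sketch of a meta refresh page in PHP, assuming the destination URL has already been looked up on the server. The names render_redirect_page and $resolvedUrl are hypothetical, and whether Facebook's scraper really ignores a meta refresh is an assumption you would need to test.

```php
<?php
// Hypothetical helper: build an HTML page that redirects via meta refresh.
// The response status stays 200; the redirect happens in the browser.
function render_redirect_page(string $targetUrl, int $delaySeconds = 0): string
{
    // Escape the URL so it cannot break out of the attribute values.
    $safeUrl = htmlspecialchars($targetUrl, ENT_QUOTES, 'UTF-8');

    return "<!DOCTYPE html>\n"
        . "<html><head>\n"
        . "<meta charset=\"utf-8\">\n"
        . "<meta http-equiv=\"refresh\" content=\"{$delaySeconds};url={$safeUrl}\">\n"
        . "<title>Redirecting</title>\n"
        . "</head><body>\n"
        // Fallback link in case the refresh is blocked or ignored.
        . "<p><a href=\"{$safeUrl}\">Continue</a></p>\n"
        . "</body></html>\n";
}

// Example use (assumption: $resolvedUrl holds the looked-up destination):
// echo render_redirect_page($resolvedUrl);
```

Unlike the JavaScript approach, this needs no script support in the browser, and the plain link gives visitors a manual way through.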
Upvotes: 0
Reputation: 163313
All you need to do is appropriately set up robots.txt.
http://www.robotstxt.org/robotstxt.html
Upvotes: -2
Reputation: 53
You could look through your web server's log file for unusual user agents (perhaps ones containing "facebook"). Alternatively, take the logs and delete every line containing Internet Explorer, Firefox, Opera, and so on. What remains should be only bot user agents, and you can then search those for the Facebook one.
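The filtering idea could be sketched in PHP roughly as follows. It assumes the access log uses the common "combined" format, where the user agent is the last quoted field. The function name tally_unusual_agents and the list of browser hints are hypothetical, and the hints are crude: some bots include "Chrome" or "Safari" in their user agent and would be skipped.

```php
<?php
// Hypothetical helper: count user agents in access-log lines,
// skipping anything that looks like an ordinary browser.
function tally_unusual_agents(
    array $logLines,
    array $browserHints = ['Firefox', 'Chrome', 'Safari', 'MSIE', 'Trident', 'Opera', 'Edg']
): array {
    $counts = [];
    foreach ($logLines as $line) {
        // Assumption: the user agent is the last double-quoted field on the line.
        if (!preg_match('/"([^"]*)"\s*$/', $line, $m)) {
            continue;
        }
        $agent = $m[1];
        foreach ($browserHints as $hint) {
            if (stripos($agent, $hint) !== false) {
                continue 2; // looks like a regular browser, skip this line
            }
        }
        $counts[$agent] = ($counts[$agent] ?? 0) + 1;
    }
    arsort($counts); // most frequent user agents first
    return $counts;
}

// Example use (the log path is an assumption, adjust it for your server):
// print_r(tally_unusual_agents(file('/var/log/nginx/access.log', FILE_IGNORE_NEW_LINES)));
```

Scanning the remaining entries for anything mentioning "facebook" should show which user agent is actually fetching the previews.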
Upvotes: 0