Reputation: 6693
I am sending HTML-formatted event invitation emails to customers with accept/decline links in them. It seems that some of these links are being pre-fetched though, resulting in the invitations being auto-declined. Oh goodie.
I have included the rel="nofollow" attribute on the links to prevent email servers, clients or other lurking middlemen from 'clicking' the links. This has helped but not entirely eliminated the problem - I still have some bot pre-fetching the links when sent to outlook.com addresses at least.
I would like to avoid requiring additional action on the part of email recipients after (genuinely) clicking these links, especially in the 'decline' case, so I see there being two avenues to solving this:
For the latter, it is not merely a case of checking the UserAgent header - the example I am presently seeing is:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36
Admittedly an old version of Chrome already, but that's not a valid reason to ignore it! Unfortunately, I don't see anything else useful in the HTTP headers to tip me off either.
What I don't know yet is whether or not the pre-fetching culprit will attempt to execute any 'onload' JavaScript code on the landing page. If not, that may be the ticket.
Any suggestions appreciated!
UPDATE
Have tried responding to the initial request with a page triggering a client-side redirect, either via a <meta> tag or using a body onload event handler. Both were executed by the offending bot, so no joy there at least. I'm wondering if I'll have to sink to an invisible recaptcha to solve this one. Yuck.
Upvotes: 5
Views: 1525
Reputation: 11
I know this thread is old, but in case someone is still struggling with this, I'll share my solution. For fun, I was looking for a possible workaround and came up with an idea—I can check what user-agent Outlook uses and then block access to the page if it performs prefetching with that user-agent.
I used a simple script:
file_put_contents("SERVER PATH TO PREFETCH LOG FILE",
date("Y-m-d H:i:s") . " - User-Agent: " . ($_SERVER['HTTP_USER_AGENT'] ?? 'None') . PHP_EOL, FILE_APPEND);
Thanks to this, I found out that Outlook doesn’t send a user-agent at all. So, it was enough to block access to all "non-browsers" (i.e., requests without a user-agent). In my case, I placed the following code at the beginning of my PHP file:
// empty() is also true when the header is not set at all
if (empty($_SERVER['HTTP_USER_AGENT'])) {
    http_response_code(403);
    exit;
}
All browsers known to me send a User-Agent header, and so far no one using my website has had any issues.
Upvotes: 1
Reputation:
I've had the same problem with an email in which I asked clients to provide a rating. I noticed many spurious ratings, and when I started to contact clients for confirmation they sometimes had no clue what I was talking about. So I can only assume that those ratings were due to some pre-fetching bot selecting a random link from within the email. The only reliable solution, as far as I can see, is to require visitors to make one more click on the landing page to confirm whatever action they are trying to perform, i.e., use a POST request as mentioned by Robert above.
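For illustration, here is a minimal Python sketch of that confirm-then-POST idea. It is not anyone's actual implementation: the handle_rsvp function, the /rsvp path, the token format and the in-memory store are all hypothetical, and it assumes the pre-fetcher only ever issues GET requests. The GET that arrives from the email link renders a confirmation form and changes nothing; only the POST sent by the visitor's extra click records the answer.

```python
import html
from typing import Dict, Tuple

# Hypothetical mapping from the action named in the email link to the stored status.
ALLOWED_ACTIONS = {"accept": "accepted", "decline": "declined"}


def handle_rsvp(method: str, token: str, action: str,
                store: Dict[str, str]) -> Tuple[int, str]:
    """Return (HTTP status, response body) for one RSVP request.

    `store` stands in for a database table: invitation token -> status.
    """
    if token not in store or action not in ALLOWED_ACTIONS:
        return 404, "Unknown invitation or action."

    if method == "GET":
        # Safe request: render the interstitial only and change nothing.
        # A bot that merely follows the link in the email stops here.
        safe_token = html.escape(token, quote=True)
        safe_action = html.escape(action, quote=True)
        form = (
            '<form method="post" action="/rsvp">'
            f'<input type="hidden" name="token" value="{safe_token}">'
            f'<input type="hidden" name="action" value="{safe_action}">'
            f'<button type="submit">Confirm: {safe_action}</button>'
            '</form>'
        )
        return 200, form

    if method == "POST":
        # State changes only here, after the visitor's explicit click.
        store[token] = ALLOWED_ACTIONS[action]
        return 200, f"Invitation {ALLOWED_ACTIONS[action]}."

    return 405, "Method not allowed."
```

In a real site the same branching would sit inside whatever framework handles the route. A bot that also submits forms would still get through, so this only helps against link pre-fetching.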
Upvotes: 2
Reputation: 6693
I ultimately did not find any silver bullet for this problem and had to add interstitial pages into my workflow, where invitees need to confirm their acceptance or declining of invitations on my site via an additional click after following the links within the invitation emails. Not what I ultimately wanted, but I found myself backed into a corner with this one unfortunately.
I may re-visit this at some point and try to minimise the need for the additional clicks using recaptcha v3 to try to detect the bot visitors, but it still requires bouncing the site visitor around between pages and is not a great user experience.
Upvotes: 1