Reputation: 1917
I am looking for a regular expression (or another method, if there is one) for detecting bounced email messages. So far I have been going through our unattended mailbox and adding the strings I find to a regex. I figured someone would already have something more complete, which would save me from reinventing the wheel.
Here is an example of what I have so far:
/reason: 550|permanent fatal errors|Error 550|Action: Failed|Mailbox does not exist|Delivery to the following recipients failed/i
Upvotes: 1
Views: 2587
Reputation: 1415
This works for me and covers pretty much all hard bounces. It is Perl, but you can pretty safely roll your own in another language using these regexes.
use strict;
use warnings;

my $content = 'EMAIL MESSAGE HEADER AND BODY';
my $recipient;    # stays undef unless a Final-Recipient field is found in a bounce
if (
    $content =~ m/Status: 5\.\d{1,3}\.\d{1,3}/i ||    # Enhanced status code 5.x.x (permanent failure)
    $content =~ m/Action: Failed/i ||
    $content =~ m/Reason: 5\.\d{1,3}\.\d{1,3}/i ||    # Same 5.x.x code, reported as a reason
    $content =~ m/MAILER-DAEMON/i ||
    $content =~ m/Mailbox does not exist/i ||
    $content =~ m/No Such User/i ||
    $content =~ m/Delivery to the following recipients failed/i ||
    $content =~ m/Recipient address rejected/i ||
    $content =~ m/Host or domain name not found/i ||
    $content =~ m/mailbox unavailable/i
) {
    # Extract the email address from the Final-Recipient field,
    # capturing it instead of overwriting $content:
    ($recipient) = $content =~ m/^final-recipient:\s?rfc822;?\s?([^\r\n]+)/im;
}
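The Status, Action and Final-Recipient fields matched above are part of the standard delivery status notification format (a multipart/report message containing a message/delivery-status part), so when a bounce follows that format the fields can be read structurally instead of with regexes. Here is a minimal sketch in Python using the standard library email package. The function name parse_dsn and the shape of the returned dicts are my own choices, and many real-world bounces do not follow this format, so treat it as a complement to the string matching, not a replacement.

```python
from email import message_from_string


def parse_dsn(raw):
    """Return one dict per failed recipient, or [] if raw is not a standard DSN."""
    msg = message_from_string(raw)
    if msg.get_content_type() != "multipart/report":
        return []
    if str(msg.get_param("report-type", "")).lower() != "delivery-status":
        return []

    failures = []
    for part in msg.walk():
        if part.get_content_type() != "message/delivery-status":
            continue
        payload = part.get_payload()
        if not isinstance(payload, list):
            continue  # malformed part; nothing structured to read
        # The parser exposes each blank-line-separated field block as its own
        # Message object: first the per-message block, then one per recipient.
        for block in payload:
            recipient = block.get("Final-Recipient")
            if recipient is None:
                continue  # per-message block, no recipient information
            status = (block.get("Status") or "").strip()
            action = (block.get("Action") or "").strip().lower()
            if action == "failed" or status.startswith("5."):
                failures.append({
                    # "rfc822; user@example.org" -> "user@example.org"
                    "recipient": recipient.split(";", 1)[-1].strip(),
                    "action": action,
                    "status": status,
                })
    return failures
```

Calling parse_dsn on the raw text of a message returns an empty list for ordinary mail, so the result can be used directly as a hard-bounce test.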
Upvotes: 1
Reputation: 20800
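Generate a unique Return-Path: email address for each recipient. Have a catch-all account on the POP3 server that receives mail for those addresses, and match each bounce to its recipient by the address it was delivered to. Basically, this is VERP (Variable Envelope Return Path).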
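To make the idea concrete, here is a minimal Python sketch of encoding a recipient into a return-path address and decoding it again when a bounce arrives. The domain bounces.example.com, the "bounce-" prefix, and both function names are assumptions chosen for illustration; any reversible scheme works, and the "=" substitution is just a common convention.

```python
import re

# Hypothetical domain whose catch-all mailbox collects the bounces.
BOUNCE_DOMAIN = "bounces.example.com"

_VERP_RE = re.compile(
    r"^bounce-(?P<local>.+)=(?P<domain>[^=@]+)@" + re.escape(BOUNCE_DOMAIN) + r"$",
    re.IGNORECASE,
)


def verp_address(recipient):
    """Encode the recipient into a unique envelope sender address."""
    local, _, domain = recipient.rpartition("@")
    return f"bounce-{local}={domain}@{BOUNCE_DOMAIN}"


def recipient_from_verp(address):
    """Recover the original recipient from the address a bounce was sent to."""
    match = _VERP_RE.match(address.strip())
    if match is None:
        return None  # not one of our return-path addresses
    return f"{match.group('local')}@{match.group('domain')}"
```

With this in place, any message arriving at a "bounce-..." address identifies the failing recipient without parsing the bounce text at all.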
Upvotes: 1
Reputation: 29240
You're probably better off looking at the full headers of some bounced messages and identifying common elements in the X- headers that the server may have added. This will give you far fewer false positives than parsing the subject line.
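One way to do that survey is to count which X- headers show up across a sample of known bounces. The following is a rough Python sketch; the function name common_x_headers is hypothetical, and it assumes you can supply the raw text of each sample message yourself.

```python
from collections import Counter
from email import message_from_string


def common_x_headers(raw_messages):
    """Count how many of the given messages carry each X- header name."""
    counts = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        # Use a set so a header repeated within one message counts only once.
        names = {name.lower() for name in msg.keys() if name.lower().startswith("x-")}
        counts.update(names)
    return counts.most_common()
```

Headers that appear in most of the bounces but rarely in normal mail are the ones worth matching on.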
Upvotes: 1
Reputation: 726
It may be overkill for your case, but the most accurate solution is probably to use a spam filtering tool: they all need to be able to handle bounces gracefully, and they will have put a lot of effort into reducing false positives.
I would suggest SpamAssassin, personally. It is packaged as a Perl module with a command-line interface, "spamassassin", that can probably be coerced into doing what you need. The bounce message rule is called (unsurprisingly) BOUNCE_MESSAGE. Unfortunately, it is not as simple as a regular expression you can copy.
Upvotes: 1
Reputation: 15073
Email servers are too varied for this to work 100%, but you might have better luck looking in the headers of the message instead of its body, as the headers are meant to be machine readable, unlike the body.
I'd start by looking for any headers with 'error' in them.
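As a starting point, something like the following Python sketch lists every header whose name or value mentions a given word. The function name headers_mentioning is made up for this example, and which headers actually turn up depends entirely on the servers sending you bounces.

```python
from email import message_from_string


def headers_mentioning(raw, needle="error"):
    """Return (name, value) pairs for headers whose name or value contains needle."""
    msg = message_from_string(raw)
    needle = needle.lower()
    return [
        (name, value)
        for name, value in msg.items()
        if needle in name.lower() or needle in str(value).lower()
    ]
```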
Upvotes: 1