dlo

Reputation: 1583

regex for email address list that may span multiple lines

I want to scan the body of an email for email address lists from forwarded emails, like:

From: John Smith <[email protected]>
To: Jane Smith <[email protected]>, Mary Smith
<[email protected]>
Cc: Ed Smith <[email protected]>
Subject: this is a test

I'm going to use Mail_RFC822::parseAddressList() to fully parse each list (there are a lot of details to get right in there, so I shouldn't try to re-engineer it), but I do want to pluck out the lines to hand off to this function. I have a simple regex that just looks for lines with email addresses, and that works most of the time.

But in the wild, there are sometimes emails like the example above, where the name and address get split onto different lines. If I do it line by line, the top half of the To: line above will fail to parse at all in parseAddressList() because a name without an address is invalid; and the bottom half will parse, but will be missing the name, which was on the previous line.

So I need a regex that can look at multiple lines at once, which complicates things beyond my expertise. An adequate solution would continue to group lines together as long as it keeps finding a basic email pattern ([\w\.\+\-]+@[\w\.\-]+\.[\w\.\-]+ ... doesn't need to be perfect) but without a word-colon combo at the beginning of the line (^\S*:) so that, as in the example above, the Cc: line is a separate match. Thanks in advance for your help.

Upvotes: 3

Views: 275

Answers (2)

Lajos Mészáros

Reputation: 3871

How about using the regex s modifier (PCRE_DOTALL), so that . matches newline characters too: /your regex/s ?
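
A minimal sketch of how the s modifier could be combined with the m modifier and a negative lookahead, so that each match starts at a word-colon combo and keeps going across line breaks until the next one. The sample addresses are placeholders (the originals are not visible here), and the variable names are only illustrative:

```php
<?php
// Hypothetical sample; the example.com addresses are placeholders.
$body = "From: John Smith <john@example.com>\n"
      . "To: Jane Smith <jane@example.com>, Mary Smith\n"
      . "<mary@example.com>\n"
      . "Cc: Ed Smith <ed@example.com>\n"
      . "Subject: this is a test\n";

// m: ^ matches at the start of every line; s: . also matches newlines.
// The lookahead stops a match just before a line break that is
// followed by a word-colon combo (the start of the next header).
preg_match_all('/^\S+:(?:(?!\R\S*:).)*/ms', $body, $m);

// Keep only the groups containing the rough email pattern from the question.
$groups = array_values(preg_grep('/[\w.+\-]+@[\w.\-]+\.[\w.\-]+/', $m[0]));

foreach ($groups as $g) {
    // Collapse line breaks so each group prints as one logical line.
    echo preg_replace('/\s*\R\s*/', ' ', trim($g)), "\n";
}
```

With this input the To: group should span both physical lines, while Subject: is dropped because it contains no address. The header name (To:, Cc:) would still need to be stripped before handing each group to parseAddressList(), and the sketch does not try to detect where the header block ends and ordinary body text begins.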

Upvotes: 0

instanceof me

Reputation: 39138

You can pre-process the string to remove any line break that comes directly before a < character, and then pass the result to your parseAddressList function.

Something like replacing /(?:\r?\n|\r)\s*</ with <:

$emails = Mail_RFC822::parseAddressList(preg_replace('/(?:\r?\n|\r)\s*</', '<', $emailHeaders));
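
A rough sketch of what that pre-processing does to a split To: line, followed by a line-by-line split into header name and address list. The sample addresses are placeholders, and the replacement here is ' <' rather than '<' (a small variation, assumed only to keep a space between the name and the address). The call to parseAddressList() is left out so the snippet runs without the PEAR Mail package:

```php
<?php
// Hypothetical sample; the example.com addresses are placeholders.
$emailHeaders = "To: Jane Smith <jane@example.com>, Mary Smith\n"
              . "<mary@example.com>\n"
              . "Cc: Ed Smith <ed@example.com>\n";

// Join any line that starts with "<" onto the previous line.
$joined = preg_replace('/(?:\r?\n|\r)\s*</', ' <', $emailHeaders);

$lists = [];
foreach (preg_split('/\R/', $joined, -1, PREG_SPLIT_NO_EMPTY) as $line) {
    // Split "To: ..." into the header name and the address list.
    if (preg_match('/^(\S+):\s*(.+)$/', $line, $m)) {
        $lists[$m[1]] = $m[2];
    }
}

// Each value is now a complete list that could be handed to parseAddressList().
print_r($lists);
```

This only repairs breaks that fall immediately before the <; a list that wraps somewhere else (for example after a comma) would still arrive as two lines.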

Upvotes: 1
