Reputation: 954
I'm trying to read bounced e-mails by connecting via PHP to an IMAP account and fetching all e-mails. I'm looking to retrieve the "Diagnostic-Code" message for each e-mail and I wrote the following regex:
/Diagnostic-Code:\s+?(.*)/i
The message that I'm trying to parse is this:
Diagnostic-Code: smtp; 550-5.1.1 The email account that you tried to reach does
not exist. Please try 550-5.1.1 double-checking the recipient's email
address for typos or 550-5.1.1 unnecessary spaces. Learn more at 550 5.1.1
https://support.google.com/mail/?p=NoSuchUser 63si4621095ybi.465 - gsmtp
The regex only partly works: it captures just the first line of text. I want to capture the entire message, that is, all four lines.
Is it possible to update the expression to do this matching?
Thanks.
Upvotes: 1
Views: 497
Reputation: 16817
/Diagnostic-Code:\s(.*\n(?:(?!--).*\n)*)/i
.*\n
matches the first line, including its trailing newline
(?:(?!--).*\n)*
matches subsequent lines that don't begin with "--"
Upvotes: 2
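A minimal sketch of how the pattern above might be used from PHP. The function name extractDiagnostic and the "--example-boundary--" line are assumptions for illustration only; the sample text is the one from the question. Whitespace is collapsed afterwards so the wrapped lines come back as a single string.

```php
<?php
// Hypothetical helper: returns the Diagnostic-Code text, or null if absent.
function extractDiagnostic(string $body): ?string
{
    // Pattern from the answer: first line, then every following line
    // that does not start with "--" (e.g. a MIME boundary).
    if (!preg_match('/Diagnostic-Code:\s(.*\n(?:(?!--).*\n)*)/i', $body, $m)) {
        return null;
    }
    // Collapse line breaks and runs of spaces into single spaces.
    return preg_replace('/\s+/', ' ', trim($m[1]));
}

// Sample from the question, followed by an assumed boundary line.
$body = "Diagnostic-Code: smtp; 550-5.1.1 The email account that you tried to reach does\n"
      . "not exist. Please try 550-5.1.1 double-checking the recipient's email\n"
      . "address for typos or 550-5.1.1 unnecessary spaces. Learn more at 550 5.1.1\n"
      . "https://support.google.com/mail/?p=NoSuchUser 63si4621095ybi.465 - gsmtp\n"
      . "--example-boundary--\n";

$diagnostic = extractDiagnostic($body);
echo $diagnostic, PHP_EOL;
```

Note that this pattern requires every line, including the last one, to end with a newline, so a message that ends without a trailing line break would lose its final line.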
Reputation: 163517
If there can be multiple messages starting with Diagnostic-Code:
you could use:
^Diagnostic-Code:\K.*(?:\R(?!Diagnostic-Code:).*)*
See the regex demo | Php demo
Explanation
^
Start of the string (or of each line when the m flag is set)
Diagnostic-Code:
Match literally
\K.*
Forget what was matched so far, then match the rest of the line
(?:
Non-capturing group
\R(?!Diagnostic-Code:).*
Match a Unicode newline sequence, followed by a negative lookahead asserting that what follows is not Diagnostic-Code:
. If that is the case, match the rest of that line
)*
Close the non-capturing group and repeat it 0+ times
Upvotes: 2
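As a rough illustration of the pattern above, the sketch below runs it with preg_match_all over text that contains two Diagnostic-Code entries. The m flag is added here so that ^ matches at the start of each line; the function name extractAllDiagnostics and the second entry are made up for the example.

```php
<?php
// Hypothetical helper: returns every Diagnostic-Code message in $text.
function extractAllDiagnostics(string $text): array
{
    // \K drops the "Diagnostic-Code:" label from the match, so the
    // full match ($m[0]) already holds only the message text.
    preg_match_all('/^Diagnostic-Code:\K.*(?:\R(?!Diagnostic-Code:).*)*/m', $text, $m);

    // Collapse wrapped lines into one line per message.
    return array_map(
        static fn(string $s): string => trim(preg_replace('/\s+/', ' ', $s)),
        $m[0]
    );
}

// First entry is the sample from the question; the second is invented.
$text = "Diagnostic-Code: smtp; 550-5.1.1 The email account that you tried to reach does\n"
      . "not exist. Please try 550-5.1.1 double-checking the recipient's email\n"
      . "address for typos or 550-5.1.1 unnecessary spaces. Learn more at 550 5.1.1\n"
      . "https://support.google.com/mail/?p=NoSuchUser 63si4621095ybi.465 - gsmtp\n"
      . "Diagnostic-Code: smtp; 552 hypothetical second bounce\n";

$messages = extractAllDiagnostics($text);
print_r($messages);
```

Because the pattern only stops at the next Diagnostic-Code: line or at the end of the input, any unrelated lines that follow a message would be included in it.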
Reputation: 1501
Add the s flag:
/Diagnostic-Code:\s+?(.*)/si
From this question:
In PHP... [t]he s at the end causes the dot to match all characters including newlines.
This will allow your regex to match the whole thing (see this regex101). Keep in mind that with the s flag, .* runs to the end of the input, so add some way to end the match if you have more text after the message.
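One possible way to end the match is sketched below. The terminator is an assumption, not something stated in the question: it treats a blank line, a line starting with "--", or the end of the input as the end of the message, and switches .* to the lazy .*? so the match stops at the first of those. The "--example-boundary--" line is likewise made up.

```php
<?php
// Sample from the question, followed by an assumed blank line and boundary.
$body = "Diagnostic-Code: smtp; 550-5.1.1 The email account that you tried to reach does\n"
      . "not exist. Please try 550-5.1.1 double-checking the recipient's email\n"
      . "address for typos or 550-5.1.1 unnecessary spaces. Learn more at 550 5.1.1\n"
      . "https://support.google.com/mail/?p=NoSuchUser 63si4621095ybi.465 - gsmtp\n"
      . "\n"
      . "--example-boundary--\n";

// Pattern from the answer: with the s flag, (.*) captures everything
// up to the end of the input, including the boundary line.
preg_match('/Diagnostic-Code:\s+?(.*)/si', $body, $all);
$greedy = $all[1];

// Assumed terminator: stop before a blank line, a "--" line, or end of input.
preg_match('/Diagnostic-Code:\s+?(.*?)(?=\R\R|\R--|\z)/si', $body, $m);
$bounded = preg_replace('/\s+/', ' ', trim($m[1]));

echo $bounded, PHP_EOL;
```

Which terminator is appropriate depends on what actually follows the Diagnostic-Code field in the bounce messages being parsed.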
Upvotes: 0