Cosmin

Reputation: 954

RegEx for parsing "Diagnostic-Code" in a bounced e-mail

I'm trying to read bounced e-mails by connecting via PHP to an IMAP account and fetching all e-mails. I want to retrieve the "Diagnostic-Code" message from each e-mail, and I wrote the following regex:

/Diagnostic-Code:\s+?(.*)/i

The message that I'm trying to parse is this:

Diagnostic-Code: smtp; 550-5.1.1 The email account that you tried to reach does
    not exist. Please try 550-5.1.1 double-checking the recipient's email
    address for typos or 550-5.1.1 unnecessary spaces. Learn more at 550 5.1.1
    https://support.google.com/mail/?p=NoSuchUser 63si4621095ybi.465 - gsmtp

The regex only partly works: it captures just the first line of text. I want to capture the entire message, that is, all four lines.

Is it possible to update the expression to do this matching?

Thanks.

Upvotes: 1

Views: 497

Answers (3)

jhnc

Reputation: 16817

/Diagnostic-Code:\s(.*\n(?:(?!--).*\n)*)/i
  • result will be in capture group 1
  • first .*\n matches first line including trailing newline
  • (?:(?!--).*\n)* matches subsequent lines that don't begin with "--"
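
As a rough illustration of how this pattern might be used from PHP, here is a minimal sketch. The `$body` string is a hypothetical sample: it reuses the message from the question, and the leading `Status:` line, the blank line, and the `--boundary--` line are assumptions added only to give the pattern something to stop at. Collapsing the whitespace afterwards is a choice made for this sketch, not part of the answer.

```php
<?php
// Hypothetical sample input: the field from the question, preceded by an
// assumed "Status:" line and followed by an assumed MIME boundary line.
$body = "Status: 5.1.1\n"
    . "Diagnostic-Code: smtp; 550-5.1.1 The email account that you tried to reach does\n"
    . "    not exist. Please try 550-5.1.1 double-checking the recipient's email\n"
    . "    address for typos or 550-5.1.1 unnecessary spaces. Learn more at 550 5.1.1\n"
    . "    https://support.google.com/mail/?p=NoSuchUser 63si4621095ybi.465 - gsmtp\n"
    . "\n"
    . "--boundary--\n";

$diagnostic = null;
if (preg_match('/Diagnostic-Code:\s(.*\n(?:(?!--).*\n)*)/i', $body, $m)) {
    // Group 1 still holds the line breaks and indentation (and the blank
    // line before the boundary), so trim it and collapse whitespace runs.
    $diagnostic = preg_replace('/\s+/', ' ', trim($m[1]));
}

echo $diagnostic, "\n";
```

Note that the capture only stops at a line beginning with "--", so any other lines between the message and that boundary would be captured as well.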

Upvotes: 2

The fourth bird

Reputation: 163517

If there can be multiple messages starting with Diagnostic-Code: you could use:

^Diagnostic-Code:\K.*(?:\R(?!Diagnostic-Code:).*)*

See the regex demo | Php demo

Explanation

  • ^ Start of the string (or of each line when the m flag is set)
  • Diagnostic-Code: Match literally
  • \K.* Reset the start of the reported match (forget what was matched so far), then match the rest of the line
  • (?: Non-capturing group
    • \R(?!Diagnostic-Code:).* Match a Unicode newline sequence, assert with a negative lookahead that what follows is not Diagnostic-Code:, and if so match the rest of that line
  • )* Close the non-capturing group and repeat it 0+ times
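
A minimal sketch of how this might look with `preg_match_all`, assuming the `m` flag is added so that `^` can match at the start of every line. The input text is made up for illustration, and the whitespace clean-up is an extra step chosen for this sketch.

```php
<?php
// Hypothetical input with two Diagnostic-Code fields (the texts are made up).
$text = "Diagnostic-Code: smtp; 550 5.1.1 user unknown\n"
    . "    please check the address\n"
    . "Diagnostic-Code: smtp; 552 5.2.2 mailbox full\n";

// Assumption: the m flag is set so that ^ matches at each line start.
$pattern = '/^Diagnostic-Code:\K.*(?:\R(?!Diagnostic-Code:).*)*/m';

preg_match_all($pattern, $text, $matches);

// Because of \K, each full match in $matches[0] excludes the field name.
// Trim each match and collapse the folded lines into single spaces.
$codes = array_map(
    function ($s) {
        return preg_replace('/\s+/', ' ', trim($s));
    },
    $matches[0]
);

print_r($codes);
```

As written, the pattern keeps consuming lines until the next Diagnostic-Code: or the end of the input, so in a real message any unrelated lines after the field would be included in the match.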

Upvotes: 2

moltarze

Reputation: 1501

Add the s flag:

/Diagnostic-Code:\s+?(.*)/si

From this question:

In PHP... [t]he s at the end causes the dot to match all characters including newlines.

This will allow your regex to match the whole thing (see this regex101). Because the dot now also matches newlines, (.*) will run to the end of the input, so remember to add some way to end the match if you have more text after the message.
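
One possible way to end the match, shown here only as a sketch: make the group lazy and stop at a line break that is followed by a non-whitespace character, on the assumption that continuation lines of the field always start with whitespace, as they do in the question's sample. The lookahead and the `X-Next-Field:` line are additions for this illustration and are not part of the original answer.

```php
<?php
// Sample from the question, followed by a hypothetical next field.
$body = "Diagnostic-Code: smtp; 550-5.1.1 The email account that you tried to reach does\n"
    . "    not exist. Please try 550-5.1.1 double-checking the recipient's email\n"
    . "    address for typos or 550-5.1.1 unnecessary spaces. Learn more at 550 5.1.1\n"
    . "    https://support.google.com/mail/?p=NoSuchUser 63si4621095ybi.465 - gsmtp\n"
    . "X-Next-Field: example\n";

// (.*?) is lazy; the lookahead ends it at the first line break followed by a
// non-whitespace character (an unindented line), or at the end of the input.
$pattern = '/Diagnostic-Code:\s+?(.*?)(?=\R\S|\z)/si';

$diagnostic = null;
if (preg_match($pattern, $body, $m)) {
    // Collapse the folded lines into a single line of text.
    $diagnostic = preg_replace('/\s+/', ' ', trim($m[1]));
}

echo $diagnostic, "\n";
```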

Upvotes: 0
