Redtopia

Reputation: 5247

Need a RegEx to match subject in bounced email header

I'm trying to use a regular expression to match the subject of a bounced email by looking in the header. I need to extract "Membership Activation" from this email header:

Received: from DOMAIN.mydomain.com (UnknownHost [127.0.0.1]) by DOMAIN.mydomain.net with SMTP;
   Fri, 6 Sep 2013 10:34:07 -0600
Date: Fri, 6 Sep 2013 10:34:07 -0600 (MDT)
From: "MyDomain.com" 
To: [email protected]
Message-ID: <[email protected]>
Subject: Membership Activation
MIME-Version: 1.0
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: 7bit

I tried:

^Subject: (.+)$

But ^ and $ don't work because the line endings are CRLFs. Then I tried:

[\r\n]+Subject: ((.[^\r\n])+)

But I'm not getting the last "n" of "Membership Activation" in group 2 of the result. I'm not sure if my negation of the CRLF is correct.

Any ideas?

Upvotes: 1

Views: 4310

Answers (2)

Ibrahim Najjar

Reputation: 19423

Your regular expression is fine. The problem is that, by default, the ^ and $ anchors match only at the beginning and end of the entire string.

This is easily fixed with a modifier that makes ^ and $ match at the start and end of each line instead of the start and end of the whole input. The modifier depends on the language or tool, so check the documentation of whatever you are using to find the right one.

For example in PHP:

/^Subject: (.+)$/im
                  ^
     Notice the m modifier which makes ^ and $ match at the start and end of each line

In Perl, the same as PHP:

/^Subject: (.+)$/im

In JavaScript, the same as PHP:

/^Subject: (.+)$/im

In Python, pass the following string to re.compile or to any function that accepts a pattern string:

r"(?m)^Subject: (.+)$"

In Java, the same as Python:

"(?m)^Subject: (.+)$"

In .NET, methods that deal with regular expressions have an overload that accepts a RegexOptions enumeration value, which turns multi-line mode on:

RegexOptions.Multiline
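
One caveat with CRLF input, sketched here in Python: in Python's re module, . matches \r, so even with multi-line mode on, the capture ends with a stray carriage return. This is only a minimal sketch, assuming the header is held in a string with CRLF line endings; the variable names are illustrative, and other engines may treat \r differently.

```python
import re

# Part of the header from the question, with CRLF line endings (assumed).
header = (
    "Date: Fri, 6 Sep 2013 10:34:07 -0600 (MDT)\r\n"
    "Subject: Membership Activation\r\n"
    "MIME-Version: 1.0\r\n"
)

# (?m) makes ^ and $ match at line boundaries, but "." still matches "\r".
match = re.search(r"(?m)^Subject: (.+)$", header)
raw = match.group(1)         # ends with "\r"
subject = raw.rstrip("\r")   # drop the trailing carriage return

# Alternative: exclude CR and LF from the capture so nothing needs stripping.
strict = re.search(r"(?m)^Subject: ([^\r\n]+)", header).group(1)

print(repr(raw))
print(repr(subject))
print(repr(strict))
```

Either stripping the result or using [^\r\n]+ in place of .+ should give the clean subject.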

Regex101 Demo in PHP

Edit: Apparently you are using ColdFusion, so if none of the above works for you, try the following expression:

[\s\S]+Subject: (.+)

but it is not as efficient as the previous options.
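
To illustrate how the leading [\s\S]+ behaves, here is a rough Python sketch. It is greedy, so it consumes the whole input and backtracks, which means it lands on the last "Subject: " in the text and needs at least one character before it. The input below is hypothetical (a bounce that carries two Subject lines), and [^\r\n]+ is swapped in for .+ only so the capture has no trailing \r in Python.

```python
import re

# Hypothetical input with two Subject lines and CRLF line endings.
text = (
    "Subject: Delivery failure\r\n"
    "\r\n"
    "Subject: Membership Activation\r\n"
)

# Greedy prefix: backtracks from the end, so the LAST Subject line wins.
greedy = re.search(r"[\s\S]+Subject: ([^\r\n]+)", text)

# Lazy prefix that may be empty: the FIRST Subject line wins.
lazy = re.search(r"[\s\S]*?Subject: ([^\r\n]+)", text)

# With "+", a Subject on the very first line has nothing before it to consume.
first_line = re.search(r"[\s\S]+Subject: ([^\r\n]+)", "Subject: Only line\r\n")

print(greedy.group(1))
print(lazy.group(1))
print(first_line)
```

Which occurrence you want depends on where the original headers sit in the bounce, so treat the choice between the greedy and lazy forms as something to check against real messages.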

Regex101 Demo

Upvotes: 1

waraker

Reputation: 1391

Try: [\r\n]+Subject: (([^\r\n])+)

I'm getting the last 'n' with that.
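
The reason the original pattern loses the "n" is that (.[^\r\n]) consumes two characters per repetition, and "Membership Activation" has 21 characters, so the last one is left over. A small Python sketch of the difference, assuming CRLF line endings and using a shortened copy of the header from the question:

```python
import re

# Shortened header from the question; CRLF line endings are assumed.
header = (
    "Date: Fri, 6 Sep 2013 10:34:07 -0600 (MDT)\r\n"
    "Subject: Membership Activation\r\n"
    "MIME-Version: 1.0\r\n"
)

# Original attempt: each repetition eats a pair of characters.
original = re.search(r"[\r\n]+Subject: ((.[^\r\n])+)", header)

# Corrected: each repetition eats one character that is not CR or LF.
fixed = re.search(r"[\r\n]+Subject: (([^\r\n])+)", header)

# Simpler equivalent with a single capture group.
simple = re.search(r"[\r\n]+Subject: ([^\r\n]+)", header)

print(original.group(1))
print(fixed.group(1))
print(simple.group(1))
```

The single-group form avoids the inner group, which only ever holds the last repetition.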

Upvotes: 1
