Reputation: 5247
I'm trying to use a regular expression to match the subject of a bounced email by looking in the header. I need to extract "Membership Activation" from this email header:
Received: from DOMAIN.mydomain.com (UnknownHost [127.0.0.1]) by DOMAIN.mydomain.net with SMTP;
Fri, 6 Sep 2013 10:34:07 -0600
Date: Fri, 6 Sep 2013 10:34:07 -0600 (MDT)
From: "MyDomain.com"
To: [email protected]
Message-ID: <[email protected]>
Subject: Membership Activation
MIME-Version: 1.0
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: 7bit
I tried:
^Subject: (.+)$
But ^ and $ don't work because the lines end in CRLFs. Then I tried:
[\r\n]+Subject: ((.[^\r\n])+)
But I'm not getting the last "n" of "Membership Activation" in group 2 of the result. I'm not sure if my negation of the CRLF is correct.
Any ideas?
Upvotes: 1
Views: 4310
Reputation: 19423
Your regular expression is fine. The problem is that, by default, the start-of-line ^ and end-of-line $ anchors match only at the beginning and end of the entire string.
This is easily fixed with a modifier that makes ^ and $ match at the start and end of each line instead of at the start and end of the entire input. The modifier depends on the language or tool, so check the documentation of the one you are using to find out what it is.
For example in PHP:
/^Subject: (.+)$/im
Notice the m modifier at the end, which makes ^ and $ match at the start and end of each line.
In Perl, the same as in PHP:
/^Subject: (.+)$/im
In JavaScript, the same as in PHP:
/^Subject: (.+)$/im
In Python, pass the following string to re.compile or to any function in the re module that accepts a pattern:
r"(?m)^Subject: (.+)$"
In Java, the same as Python:
"(?m)^Subject: (.+)$"
In .NET, every method that deals with regular expressions has an overload that accepts a RegexOptions enumeration value, and this one turns multi-line mode on:
RegexOptions.Multiline
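As a rough sketch of the multi-line approach in Python: the header string below is abbreviated from the question, and the variable names are made up for illustration. One caveat with CRLF input is that in Python . matches \r, so (.+)$ leaves a trailing carriage return in the captured group. The sketch uses [^\r\n]+ instead to keep it out.

```python
import re

# Sample header with CRLF line endings, abbreviated from the question.
header = (
    "Date: Fri, 6 Sep 2013 10:34:07 -0600 (MDT)\r\n"
    "Subject: Membership Activation\r\n"
    "MIME-Version: 1.0\r\n"
)

# Without multi-line mode, ^ matches only at the very start of the string,
# so a Subject line further down is never found.
no_flag = re.search(r"^Subject: (.+)$", header)

# With (?m), ^ matches after every "\n" and $ matches before every "\n".
# "." still matches "\r", so the captured text keeps the carriage return.
raw = re.search(r"(?m)^Subject: (.+)$", header).group(1)

# Excluding "\r" and "\n" explicitly keeps the line ending out of the group.
match = re.search(r"(?m)^Subject: ([^\r\n]+)", header)
subject = match.group(1) if match else None

print(no_flag)    # None
print(repr(raw))  # 'Membership Activation\r'
print(subject)    # Membership Activation
```

Other engines treat . and $ differently around \r, so it is worth checking the captured value for a stray carriage return in whichever tool is used.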
Edit: Apparently you are using ColdFusion, so if none of the above works for you, try the following expression:
[\s\S]+Subject: (.+)
but it is not as efficient as the previous options.
Upvotes: 1
Reputation: 1391
Try: [\r\n]+Subject: (([^\r\n])+)
I'm getting the last 'n' with that.
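A likely reason the original pattern loses the last character: (.[^\r\n])+ consumes two characters per repetition, so a subject with an odd number of characters, such as the 21-character "Membership Activation", cannot have its final character matched. A small Python sketch shows the difference, assuming an abbreviated header and illustrative variable names. Note that Python numbers the outer capture as group 1.

```python
import re

# Abbreviated header with CRLF line endings; "To: someone" is a placeholder.
header = "To: someone\r\nSubject: Membership Activation\r\nMIME-Version: 1.0\r\n"

# Original pattern: each repetition matches "." plus one non-CRLF character,
# i.e. two characters at a time.
original = re.search(r"[\r\n]+Subject: ((.[^\r\n])+)", header)

# Suggested pattern: each repetition matches a single non-CRLF character.
fixed = re.search(r"[\r\n]+Subject: (([^\r\n])+)", header)

print(original.group(1))  # Membership Activatio
print(fixed.group(1))     # Membership Activation
```

The inner parentheses are not needed for the extraction itself, so [\r\n]+Subject: ([^\r\n]+) should capture the same text with one group fewer.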
Upvotes: 1