Reputation: 63
I am trying to create a regex pattern to get the email address only after the word "Sender".
Below is example input:
Recip: [email protected]
Subject: Report results (Gd)
Headers: Received: from daem.com (unknown [127.1.1.1])
Date: Sat, 13 Feb 2021 13:11:42 +0000 (GMT)
From: Tavon Lo <[email protected]>
Recip: [email protected]
Subject: Report results (Gd1)
Headers: Received: from daem2.com (unknown [127.1.1.1])
Date: Sat, 14 Feb 2021 13:11:42 +0000 (GMT)
From: Tavon Lo <[email protected]>
Sender: [email protected]
Recipient: [email protected]
So, the only email address that should be in the output is [email protected]
Below is my regex pattern:
(?m)^Sender:([^<>@]+@[^<>]+)
This matches the following:
[email protected]
Recipient: [email protected]
See regex demo https://regex101.com/r/qRLrAW/1
I only want [email protected]. I am new to regex patterns, so this is probably an easy fix, but I have been stuck. Any ideas or suggestions on how to fix the regex pattern to meet my goal?
Upvotes: 1
Views: 61
Reputation: 163287
The catch here is that you have to exclude matching newlines by adding them to the negated character class.
You can also turn the match into a positive lookbehind:
(?m)(?<=^Sender: )[^<>@\n\r]+@[^<>\r\n]+
If the email address also cannot contain spaces, you can use \s instead of \r\n:
(?m)(?<=^Sender: )[^<>@\s]+@[^<>\s]+
The pattern matches:
(?m) - Inline modifier for multiline mode
(?<=^Sender: ) - Assert that "Sender: " is directly to the left, at the start of a line
[^<>@\s]+@[^<>\s]+ - Match an email-like pattern, excluding whitespace (which includes newlines)
Just as an example, with the PyPI regex module you might also use \K to get the match only.
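A minimal sketch of the lookbehind variant using Python's built-in re module, assuming the input looks like the question's sample. The addresses here are hypothetical placeholders, and find_sender is a helper name invented for illustration. The built-in re module accepts this lookbehind because it has a fixed width, but it does not support \K.

```python
import re

# Hypothetical sample input; the addresses are placeholders, not from the question.
SAMPLE = """\
Recip: first@example.com
Subject: Report results (Gd1)
From: Tavon Lo <tavon@example.com>
Sender: sender@example.com
Recipient: recipient@example.com
"""

# The lookbehind keeps "Sender: " out of the match, and \s in the negated
# character classes stops the match at the end of the line.
PATTERN = re.compile(r"(?m)(?<=^Sender: )[^<>@\s]+@[^<>\s]+")


def find_sender(text):
    """Return the first address following 'Sender: ' at a line start, or None."""
    match = PATTERN.search(text)
    return match.group(0) if match else None


print(find_sender(SAMPLE))
```

Because the prefix sits inside a lookbehind, the whole match is the address itself, so no capture group is needed.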
Upvotes: 1
Reputation: 65
I think this expression will be useful: the first part (the lookbehind) skips the "Sender: " prefix, and .+ then selects the email address on the rest of the line.
(?<=Sender: ).+
Upvotes: 0
Reputation: 154
It's because [^<>]+ matches \n as well, so it will go over the end of the line to the next line.
You need to add a \n to your negated character classes, as Wiktor Stribiżew did in his answer.
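To see the overrun concretely, here is a small Python sketch comparing the question's pattern with a version that adds \n to both negated classes. The two-line input and its addresses are made up for illustration.

```python
import re

# Hypothetical two-line input with placeholder addresses.
TEXT = "Sender: sender@example.com\nRecipient: recipient@example.com\n"

# Pattern from the question: the negated classes also accept newlines.
original = re.compile(r"(?m)^Sender:([^<>@]+@[^<>]+)")
# Same pattern with \n excluded from both negated classes.
fixed = re.compile(r"(?m)^Sender:([^<>@\n]+@[^<>\n]+)")

overmatch = original.search(TEXT).group(1)
single_line = fixed.search(TEXT).group(1)

print(repr(overmatch))    # runs past the line break into the Recipient line
print(repr(single_line))  # stops at the end of the Sender line
```

Note that the captured group still starts with the space after the colon, so it may need a strip() call or an explicit whitespace match before the group.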
Upvotes: 1
Reputation: 626794
You can use
(?m)^Sender:[^\S\r\n]*([^<>@\n\r]+@[^<>\n\r]+)
See the regex demo.
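In Python, the pattern above could be applied roughly as follows. This is only a sketch: the sample text uses made-up addresses and CRLF line endings, and extract_senders is a hypothetical helper name.

```python
import re

# Group 1 holds the address; [^\S\r\n]* consumes spaces or tabs after the colon.
SENDER_RE = re.compile(r"(?m)^Sender:[^\S\r\n]*([^<>@\n\r]+@[^<>\n\r]+)")


def extract_senders(text):
    """Return the Group 1 capture for every 'Sender:' line in text."""
    return SENDER_RE.findall(text)


# Hypothetical input with CRLF line endings and placeholder addresses.
sample = (
    "From: Tavon Lo <tavon@example.com>\r\n"
    "Sender:   sender@example.com\r\n"
    "Recipient: recipient@example.com\r\n"
)

print(extract_senders(sample))
```

With a single capture group, re.findall returns the group contents directly, so the "Sender:" prefix and the padding never appear in the result.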
Details:
(?m)^ - start of a line
Sender: - a literal string
[^\S\r\n]* - zero or more whitespaces other than CR and LF
([^<>@\n\r]+@[^<>\n\r]+) - Group 1: one or more chars other than <, >, @, CR and LF, then @, and then one or more chars other than <, >, CR and LF.
Upvotes: 2