Reputation: 4519
I have a string with an email address between tags, and I need to extract that email. For example, from here:
var myString = "This email for John <[email protected]> needs to be extracted"
I would like to extract [email protected]
I came up with this regular expression to extract an email address from a string (it doesn't need to validate the email, just a simple regex):
/<\S*@\S*>/gi
It works fine if my string doesn't have other tags, like the previous one. But this regex fails when it finds this scenario:
var myString = "This bold email for John <b><[email protected]><b/> needs to be extracted"
How can I improve my regex to match only the email, ignoring other tags?
P.S.: My end goal is to strip those tags out of the string (only the email tags), so I am open to other suggestions on how to do that as well.
Thanks!
Upvotes: 1
Views: 516
Reputation: 4912
"This bold email for John <b><[email protected]><b/> needs to be extracted".match(/<\w+@\w+\.\w+>/)[0];
// <[email protected]>
\w matches letters, numbers, and the underscore. Followed by a +, it matches an unbroken string of such characters. The . is a special character in regex, so it has to be escaped.
So in the regex \w+@\w+\.\w+, the first \w+ matches john, then @ matches @ of course. The next \w+ matches mail, the \. matches ., and the last \w+ matches com.
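As a quick check of what this pattern does and does not cover, here is a minimal sketch. The address john@mail.com is an assumption taken from the walkthrough above (the address in the post is obscured), and john.doe@mail.co.uk is a hypothetical extra case.

```javascript
// Assumed sample input; the address stands in for the obscured one.
const input = "This bold email for John <b><john@mail.com><b/> needs to be extracted";
const simple = /<\w+@\w+\.\w+>/;

// <b> and <b/> contain no @, so only the email tag matches.
const hit = input.match(simple);
console.log(hit[0]); // <john@mail.com>

// \w does not match "." or "-", so a dotted local part or a
// multi-part domain is not matched by this pattern.
const miss = "<john.doe@mail.co.uk>".match(simple);
console.log(miss); // null
```

If your addresses can contain dots or hyphens, you will probably want a wider character class than \w.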
Upvotes: 0
Reputation: 163207
You can exclude the angle brackets, @, and whitespace characters on both sides of the @ by using a negated character class, [^<>\s@]:
<[^<>\s@]+@[^<>\s@]+>
If you want to allow zero or more characters on either side of the @, you could use * instead of +.
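The original /<\S*@\S*>/ fails because \S also matches < and >, so the greedy \S* runs across the neighbouring tags. Below is a rough sketch of using the negated class for both extraction and stripping. It assumes the obscured address is something like john@mail.com, and that "strip" means removing the angle brackets while keeping the address. The capture group and the names emailTag, emails, and stripped are my additions for illustration.

```javascript
// Assumed sample input; the address stands in for the obscured one.
const input = "This bold email for John <b><john@mail.com><b/> needs to be extracted";

// The parentheses capture the address without the surrounding brackets.
const emailTag = /<([^<>\s@]+@[^<>\s@]+)>/g;

// Collect every captured address.
const emails = Array.from(input.matchAll(emailTag), (m) => m[1]);
console.log(emails); // [ 'john@mail.com' ]

// Remove only the brackets around emails; other tags are left alone.
const stripped = input.replace(emailTag, "$1");
console.log(stripped);
// This bold email for John <b>john@mail.com<b/> needs to be extracted
```

To drop the address as well, replace with an empty string instead of "$1".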
Upvotes: 1
Reputation: 1140
It would be difficult to get a perfect regular expression for matching all email addresses and only email addresses, but this should work for your case:
/<[^<\s]*\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b[^>\s]*>/gi
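Here is a small sketch of how this stricter pattern behaves. The input string and the addresses in it are made up for illustration; the point is that it requires a dot followed by at least two letters after the @, so something like <x@y> is skipped.

```javascript
const strict = /<[^<\s]*\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b[^>\s]*>/gi;

// Hypothetical input mixing a real-looking address, a bare x@y, and plain tags.
const input = "Contact <b><john.doe@mail.co.uk></b> or <x@y>, not <b>bold</b>";

// With the g flag, match() returns every full match as an array.
const found = input.match(strict);
console.log(found); // [ '<john.doe@mail.co.uk>' ]
```

Note that the leading [^<\s]* and trailing [^>\s]* also let through extra characters inside the brackets, so this is looser than it looks at first glance.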
Upvotes: 1