Bruno Monteiro

Reputation: 4519

Which Regular Expression could I use to find an email address inside tags?

My problem

I have a string with an email address between tags, and I need to extract that email. For example, from here:

var myString = "This email for John <john@mail.com> needs to be extracted"

I would like to extract john@mail.com

My attempt to fix

I came up with this regular expression to extract an email address from a string (it doesn't need to validate the email, just a simple regex):

/<\S*@\S*>/gi

It works fine if my string doesn't have other tags, like the previous one. But this regex fails when it finds this scenario:

var myString = "This bold email for John <b><john@mail.com><b/> needs to be extracted"

How can I improve my regex to match only the email, ignoring other tags?

P.S.: My end goal is to strip those tags out of the string (only the email tags), so I am open to other suggestions on how to do that as well.

Thanks!

Upvotes: 1

Views: 516

Answers (3)

GirkovArpa

Reputation: 4912

"This bold email for John <b><john@mail.com><b/> needs to be extracted".match(/<\w+@\w+\.\w+>/)[0];
// <john@mail.com>

\w matches letters, numbers, and the underscore. Followed by a +, it matches an unbroken string of such characters.

. is a special character in regex (it matches any character), so it has to be escaped as \. to match a literal dot.

So in the regex \w+@\w+\.\w+, the first \w+ matches john, then @ matches @ of course. The next \w+ matches mail, the \. matches ., and the last \w+ matches com.
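
A small sketch of how this behaves in practice, assuming the address in the question is john@mail.com. The capturing group and the variable names are additions for illustration only. Note that \w does not cover dots or hyphens, so an address such as john.doe@mail.com would not be matched by this pattern.

```javascript
// Sketch: same pattern, with a capturing group added around the address
// so the angle brackets can be dropped from the result.
const text = "This bold email for John <b><john@mail.com><b/> needs to be extracted";
const pattern = /<(\w+@\w+\.\w+)>/;

const found = text.match(pattern);
const email = found ? found[1] : null; // group 1 is the part between < and >
console.log(email);

// Limitation: \w excludes "." and "-", so this address is not matched.
const dotted = "Mail for <john.doe@mail.com>".match(pattern);
console.log(dotted); // null
```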

Upvotes: 0

The fourth bird

Reputation: 163207

You can exclude the angle brackets, @, and whitespace characters on both sides of the @ by using a negated character class, [^<>\s@]

The + requires at least one such character on each side. If you want to allow zero or more, you could use * instead of +

<[^<>\s@]+@[^<>\s@]+>
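
Since the end goal in the question is to strip only the brackets around the email, here is a minimal sketch that wraps this pattern in a capturing group and uses String.prototype.replace. The function name stripEmailBrackets is hypothetical, and the sample address john@mail.com is assumed from the question.

```javascript
// Hypothetical helper: removes the < > around email-like tokens only,
// leaving other tags such as <b> untouched.
function stripEmailBrackets(input) {
  // Group 1 captures the address; "$1" writes it back without brackets.
  return input.replace(/<([^<>\s@]+@[^<>\s@]+)>/g, "$1");
}

const plain = stripEmailBrackets("This email for John <john@mail.com> needs to be extracted");
const bold = stripEmailBrackets("This bold email for John <b><john@mail.com><b/> needs to be extracted");
console.log(plain);
console.log(bold);
```

Because the character class excludes < and >, a tag like <b> can never be swallowed into the match, which is what goes wrong with \S* in the original attempt.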


Upvotes: 1

zeterain

Reputation: 1140

It would be difficult to get a perfect regular expression for matching all email addresses and only email addresses, but this should work for your case:

/<[^<\s]*\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b[^>\s]*>/gi
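
As a rough usage sketch, the pattern can be applied globally to collect every bracketed address and then trim the brackets. The sample string, the second address, and the variable names are made up for illustration.

```javascript
// Sketch: apply the pattern globally and drop the surrounding < >.
const emailTag = /<[^<\s]*\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}\b[^>\s]*>/gi;
const sample = "Contacts: <b><john@mail.com><b/> and <jane.doe@example.org>";

// match() with the g flag returns every full match, or null if there are none.
const tags = sample.match(emailTag) || [];
const addresses = tags.map((tag) => tag.slice(1, -1)); // remove leading < and trailing >
console.log(addresses);
```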

Upvotes: 1
