Reputation: 1363
How can I extract the name and email of each entry from a string in which the entries are separated by commas?
The regex below works great for a single email, but not for several emails in one string.
(?:"?([^"]*)"?\s)?(?:<?(.+@[^>]+)>?)
Note the comma within the name as well.
[email protected], John <[email protected]>, John D, A <[email protected]>, "John Doe , Yen" <[email protected]>
Output:
Name: null
Email: [email protected]
Name: John
Email: [email protected]
Name: John D, A
Email: [email protected]
Name: John Doe , Yen
Email: [email protected]
Upvotes: 4
Views: 1584
Reputation: 70732
It's hard to tell if the data will change or remain the same, but here's my attempt:
// str is assumed to hold the comma-separated input string from the question
var re = /(?:"?([A-Z][^<"]+)"?\s*)?<?([^>\s,]+)/g;
var m;
while ((m = re.exec(str)) !== null) {
  if (m[1]) { m[1] = m[1].trim(); }
  console.log("Name: " + m[1]);
  console.log("Email: " + m[2]);
}
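For illustration, the same pattern can be wrapped in a function that returns the pairs instead of logging them. This is only a sketch: the function name `extractContacts` and the addresses in `sample` are made up for the example, and a missing name is mapped to null to match the output asked for in the question. Because of the `[A-Z]` in the pattern, a name is only picked up when it starts with an uppercase letter.

```javascript
// Hypothetical wrapper around the regex from the answer above.
function extractContacts(str) {
  // The optional name group only matches names starting with A-Z.
  const re = /(?:"?([A-Z][^<"]+)"?\s*)?<?([^>\s,]+)/g;
  const contacts = [];
  for (const m of str.matchAll(re)) {
    contacts.push({
      // m[1] is undefined when the entry is a bare address.
      name: m[1] ? m[1].trim() : null,
      email: m[2],
    });
  }
  return contacts;
}

// Placeholder addresses, invented for this example.
const sample =
  'a@example.com, John <john@example.com>, John D, A <jda@example.com>, ' +
  '"John Doe , Yen" <yen@example.com>';

for (const c of extractContacts(sample)) {
  console.log("Name: " + c.name);
  console.log("Email: " + c.email);
}
```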
Upvotes: 3
Reputation: 14668
Here is one possible answer:
(?:^|, *)(?![^",]+")(?:((?=[^"<]+@)|(?![^"<]+@)"?(?<name>[^"<]*)"? *))<?(?<email>[^,>]*)>?
This uses Ruby regex syntax, with lookaheads to determine whether an entry has a name.
(?:^|, *)
: start at the front of the string, or after a , and any number of spaces
(?![^",]+")
: negative lookahead, abort the match if there are some characters and then a ". This stops commas from starting matches inside quoted strings.
(?:((?=[^"<]+@)|(?![^"<]+@)"?(?<name>[^"<]*)"? *))
: matching the name:
(?=[^"<]+@)
: if an @ occurs before a quote or an opening angle bracket, it is just an email address without a name, so match nothing here
(?![^"<]+@)"?(?<name>[^"<]*)"? *)
: otherwise, match the name (skipping the opening and closing quotes if they are present)
<?(?<email>[^,>]*)>?
: match the email.
Note that for a real job, this would be a terrible approach. The regex is nearly incomprehensible, not to mention fragile. It also isn't complete, e.g. what happens if quotes can be escaped inside the name?
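If you want to try the pattern outside Ruby, it also appears to be valid JavaScript syntax (named groups and lookaheads are supported there). The sketch below assumes that; `entryPattern`, `matchEntries` and the sample addresses are names and data invented for the example. The captured name keeps its trailing space, so it is trimmed afterwards.

```javascript
// The pattern from this answer, unchanged apart from the added g flag.
const entryPattern =
  /(?:^|, *)(?![^",]+")(?:((?=[^"<]+@)|(?![^"<]+@)"?(?<name>[^"<]*)"? *))<?(?<email>[^,>]*)>?/g;

// Hypothetical helper: returns one { name, email } object per entry.
function matchEntries(input) {
  return Array.from(input.matchAll(entryPattern), (m) => ({
    // groups.name is undefined for a bare address; empty names become null.
    name: (m.groups.name || "").trim() || null,
    email: m.groups.email,
  }));
}

// Placeholder addresses, invented for this example.
const addresses =
  'a@example.com, John <john@example.com>, John D, A <jda@example.com>, ' +
  '"John Doe , Yen" <yen@example.com>';

console.log(matchEntries(addresses));
```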
I would write a dedicated parser for this if you really need it. If you are just trying to extract some data though, the regex may be good enough.
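To give an idea of what such a parser could look like, here is a minimal sketch in JavaScript. It is not a full address parser: it assumes that every entry contains an @ in its last comma-separated segment, that quotes are never escaped, and that a segment without an @ is the start of a name that continues after the comma. The names `splitOutsideQuotes` and `parseAddressList` and the sample addresses are made up for the example.

```javascript
// Hypothetical helper: split on commas that are not inside double quotes.
function splitOutsideQuotes(input) {
  const parts = [];
  let current = "";
  let inQuotes = false;
  for (const ch of input) {
    if (ch === '"') {
      inQuotes = !inQuotes;
      current += ch;
    } else if (ch === "," && !inQuotes) {
      parts.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  parts.push(current);
  return parts;
}

// Hypothetical parser: returns one { name, email } object per entry.
function parseAddressList(input) {
  const results = [];
  let pending = [];
  for (const segment of splitOutsideQuotes(input)) {
    pending.push(segment);
    // No "@" yet: assume the name contains a comma and keep collecting.
    if (!segment.includes("@")) continue;
    const entry = pending.join(",").trim();
    pending = [];
    const m = entry.match(/^(.*?)\s*<([^<>]+)>$/);
    if (m) {
      // Drop one pair of surrounding quotes from the name, if present.
      const name = m[1].replace(/^"|"$/g, "").trim();
      results.push({ name: name || null, email: m[2].trim() });
    } else {
      // Bare address without a name.
      results.push({ name: null, email: entry });
    }
  }
  // Trailing segments without an "@" are silently dropped.
  return results;
}

// Placeholder addresses, invented for this example.
console.log(parseAddressList(
  'a@example.com, John <john@example.com>, John D, A <jda@example.com>, ' +
  '"John Doe , Yen" <yen@example.com>'
));
```

Splitting first and deciding afterwards keeps each step readable, and the escaped-quote case could be added in `splitOutsideQuotes` without touching the rest.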
Upvotes: 0