Reputation: 2549
I am trying to make a RegEx that can match the domain portion of an email address. Right now I have to use two of them, one that gets all the email addresses and then another that matches the domain, but I'm still having issues.
Right now the code I have is this:
var email_ex = /[a-zA-Z0-9]+(?:(\.|_)[A-Za-z0-9!#$%&'*+/=?^`{|}~-]+)*@(?!([a-zA-Z0-9]*\.[a-zA-Z0-9]*\.[a-zA-Z0-9]*\.))(?:[A-Za-z0-9](?:[a-zA-Z0-9-]*[A-Za-z0-9])?\.)+[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?/ig; // Match all email addresses on page
email_ex = new RegExp(email_ex);
var domain_ex = /[a-zA-Z0-9\-\.]+\.(com|org|net|mil|edu|COM|ORG|NET|MIL|EDU|CO\.UK|AU|LI|LY|IT|IO)/ig // Match all domains
domain_ex = new RegExp(domain_ex);
var match = document.body.innerText; // Location to pull our text from. In this case it's the whole body
match = match.match(email_ex); // Run the RegExp on the body's innerText
I'd rather not have to maintain a list of TLDs, but I haven't been able to find an expression that is good enough.
Upvotes: 0
Views: 294
Reputation: 7795
+1 for @strah, the answer works great, but for the input "@example.domain" it returns "example.domain". In my opinion it should return null, since that is not a valid email address.
If you want to be extra strict about the email format, you can do as follows:
var r = /[^\s]+@([^\s]+)/;
r.exec("user@testing.domain")[1]; // "testing.domain"
r.exec("@testing.domain"); // null: there is no match, so check the result before reading [1]
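Because `exec` returns `null` when nothing matches, indexing the result directly throws on invalid input. A minimal sketch of a guarded wrapper, assuming the same pattern as above; the helper name `domainOf` and the sample address are illustrative and not part of the original answer:

```javascript
// Hypothetical helper: returns the domain, or null when the text has no
// "something@domain" shape (for example, when the local part is missing).
function domainOf(text) {
  var m = /[^\s]+@([^\s]+)/.exec(text);
  return m ? m[1] : null;
}

console.log(domainOf("user@testing.domain")); // "testing.domain"
console.log(domainOf("@testing.domain"));     // null
```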
Upvotes: 1
Reputation:
You should be able to find the email addresses and capture the domain part in a single operation, with a single regex.
The example below uses a regex from the HTML5 spec, but you can keep your own and just wrap the domain part in a capture group.
# http://www.w3.org/TR/html5/forms.html#valid-e-mail-address
# /[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@([a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*)/
 [a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+
 @
 (                                  # (1 start)
      [a-zA-Z0-9]
      (?:
           [a-zA-Z0-9-]{0,61}
           [a-zA-Z0-9]
      )?
      (?:
           \.
           [a-zA-Z0-9]
           (?:
                [a-zA-Z0-9-]{0,61}
                [a-zA-Z0-9]
           )?
      )*
 )                                  # (1 end)
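As a rough illustration of the single-pass idea, the sketch below adds the `g` flag to the pattern above and loops with `exec`, collecting capture group 1 from each match. The helper name `extractDomains` and the sample text are assumptions made for this example; in the question's setting the input would be `document.body.innerText`.

```javascript
// Pattern from the answer above, with the g flag added so exec() can iterate.
var emailRe = /[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@([a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*)/g;

// Hypothetical helper: collects the domain (capture group 1) of every
// address found in the given text.
function extractDomains(text) {
  var domains = [];
  var m;
  emailRe.lastIndex = 0; // a global regex keeps its position between calls
  while ((m = emailRe.exec(text)) !== null) {
    domains.push(m[1]);
  }
  return domains;
}

var sample = "Write to alice@example.org or bob@mail.example.co.uk today";
console.log(extractDomains(sample)); // ["example.org", "mail.example.co.uk"]
```

This finds the addresses and the domains in one pass, so the second TLD-list regex from the question is not needed.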
Upvotes: 0
Reputation: 31
If you don't need a regex that validates the email address because you already know you have one (and email addresses found on web pages are mostly valid), you can use this.
A domain can't contain an @, so you can consume all characters up to the last @:
(.*)@(.*)
The domain is then in the second group.
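To see why the second group holds the domain, note that the first `(.*)` is greedy, so it backtracks only as far as the last `@`. A small sketch with a made-up input containing two `@` characters; the `lastIndexOf` helper is an alternative added here for comparison, not something the answer proposed:

```javascript
// Greedy (.*) takes as much as it can, so the split happens at the last "@".
var parts = /(.*)@(.*)/.exec("odd@local@example.com");
console.log(parts[1]); // "odd@local"
console.log(parts[2]); // "example.com"

// Alternative without a regex (hypothetical helper name).
function domainAfterLastAt(address) {
  var i = address.lastIndexOf("@");
  return i === -1 ? null : address.slice(i + 1);
}
console.log(domainAfterLastAt("odd@local@example.com")); // "example.com"
```

Both versions assume the input is a single address on one line; neither checks that the address is valid.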
Upvotes: 1
Reputation: 73
I agree you should not have a list of TLDs. Your regex is already missing many, and this is going to become a very long list as generic TLDs become more common. This should get you pretty close:
(?<=@)(?:[a-zA-Z0-9][-a-zA-Z0-9]*[a-zA-Z0-9]\.)+[a-zA-Z0-9]{2,}
Or commented:
(?<=@) (?# Check it is preceded by @ )
(?: (?# start of subdomain block )
[a-zA-Z0-9][-a-zA-Z0-9]*[a-zA-Z0-9] (?# subdomain )
\.)+ (?# end of subdomain, including dot, repeats )
[a-zA-Z0-9]{2,} (?# TLD )
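For use from JavaScript, as in the question, two caveats apply: regex literals do not accept `(?# ... )` comments, and lookbehind needs an engine with ES2018 support. A minimal sketch under those assumptions, using the uncommented pattern and made-up sample text:

```javascript
// Same pattern as above, as a JavaScript literal with the g flag to find every match.
var domainRe = /(?<=@)(?:[a-zA-Z0-9][-a-zA-Z0-9]*[a-zA-Z0-9]\.)+[a-zA-Z0-9]{2,}/g;

var text = "Contact bob@mail.example.com or alice@example.org for details";
var found = text.match(domainRe) || []; // match() returns null when nothing is found
console.log(found); // ["mail.example.com", "example.org"]

// Each label needs at least two characters, so a one-letter label is not matched.
console.log("carol@x.org".match(domainRe)); // null
```

If one-letter labels matter, the label part would need to make the trailing characters optional, as the HTML5-style pattern in the other answer does.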
Upvotes: 0
Reputation: 6732
The simplest RegExp: /@([^\s]*)/
var email = "user@example.com"; // placeholder address
var domain = email.match(/@([^\s]*)/)[1];
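The simplicity comes with a couple of edge cases worth knowing about. The sketch below uses placeholder inputs (not from the original answer) to show what the same pattern captures when the domain is empty or when the address sits inside a sentence:

```javascript
var simple = /@([^\s]*)/;

// A clean address gives the expected domain.
console.log("user@example.com".match(simple)[1]); // "example.com"

// The * quantifier allows zero characters, so a bare "@" still matches.
console.log("user@".match(simple)[1]); // ""

// Everything up to the next whitespace is captured, including punctuation.
console.log("Mail user@example.com, please".match(simple)[1]); // "example.com,"
```

When the input is free text rather than a single known-good address, a stricter domain pattern avoids picking up the trailing punctuation.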
Upvotes: 4