Reputation: 12900
Over the years I have slowly developed a regular expression that validates most email addresses correctly, assuming they don't use an IP address as the server part.
I use it in several PHP programs, and it works most of the time. However, from time to time I get contacted by someone that is having trouble with a site that uses it, and I end up having to make some adjustment (most recently I realized that I wasn't allowing four-character TLDs).
What is the best regular expression you have or have seen for validating emails?
I've seen several solutions that use functions that use several shorter expressions, but I'd rather have one long complex expression in a simple function instead of several short expression in a more complex function.
Upvotes: 4265
Views: 2735175
Reputation: 11
We get a more practical implementation of RFC 5322 if we omit IP addresses, domain-specific addresses, the syntax using double quotes and square brackets. It will still match 99.99% of all email addresses in actual use today. Refer: https://www.regular-expressions.info/email.html
private String REGEX_EMAIL_CHECK = "^[A-Za-z0-9!#%&'*+/=?^_`{|}~-]+(?:\\.[A-Za-z0-9!#%&'*+/=?^_`{|}~-]+)*@(?:[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?\\.)+[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?$";
private boolean isValidEmailWithRegex(String email) {
return email.matches(REGEX_EMAIL_CHECK);
}
public void testEmailAddressRegex() {
// valid ?
String[] validEmailAddresses = new String[] {
"[email protected]",
"[email protected]",
"[email protected]",
"[email protected]",
"[email protected]",
"[email protected]",
"[email protected]",
"[email protected]",
"[email protected]",
"[email protected]",
"a%[email protected]",
"[email protected]",
"a~^[email protected]",
"a~&[email protected]",
"a*[email protected]",
"[email protected]",
"a~!#%^&*[email protected]",
"a~!%^&*[email protected]",
"a~%^&*[email protected]",
"a~^&*[email protected]",
"[email protected]",
"[email protected]",
"[email protected]",
"[email protected]",
"[email protected]",
"abcdefghijklmnopqrstuvwxyz-abcdefghijklmnopqrstuvwxyz@example.com",
};
for (String email : validEmailAddresses) {
System.out.println("Testing email address: " + email + " is valid");
assertTrue(isValidEmailWithRegex(email));
}
// Invalid ?
String[] invalidEmailAddresses = new String[] {
"",
" ",
"@",
"username",
"username@example",
"username@example.",
"[email protected].",
"username@example-com.",
"[email protected].",
"username@@example.com",
"username@example@com",
"[email protected]@",
"[email protected]@example",
"[email protected]@example.com",
"[email protected]@example.com.",
"[email protected]@example-com.",
"[email protected]@example.com-.",
"[email protected]@example.com.example",
"[email protected]@example-com.example",
"[email protected]@example.com-example",
};
for (String email : invalidEmailAddresses) {
System.out.println("Testing email address: " + email + " is Invalid");
assertFalse(isValidEmailWithRegex(email));
}
}
Upvotes: 1
Reputation: 2098
For oracle database, version 23+, there is a built-in Usage Domain name email_d
which uses the regex
^([a-zA-Z0-9!#$%&*+=?^_`{|}~-]+(\.[ A-Za-z0-9!#$%&*+=?^_`{|}~-]+)*)@(([ a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9] )?\.)+[a-zA-Z0-9]([a-zA-Z0-9-]*[a-z A-Z0-9])?)$
Of course, this is rather simplistic compared to the other answers, but... it is built-in.
Upvotes: 1
Reputation: 27
This answer largely and directly addresses multiple issues in the currently highest upvoted answer.
This answer also reinterprets and optimizes the regex as used in WebKit for example, for the Email Input Type.
The explicit ordering of certain parts of the expressions and characters/ranges in character classes is in some cases intentional. For example, some parts of the patterns have been intentionally optimized for lower-case alpha, digits, and upper-case alpha in that order, assuming that to cover the most frequent usages, and even if not frequent, then as canonicalized.
First, an attempt at a potentially correct implementation, barring any errata (currently, still under active development and improvement):
Domain part for Intranet/Local:
(?=.{1,254}$)[a-z0-9A-Z](?:[a-z0-9A-Z-]{0,61}[a-z0-9A-Z]|)(?:\.[a-zA-Z]([a-z0-9A-Z-]{0,61}[a-z0-9A-Z]|))*
Full and Expanded RegExp:
^(?:
[-^-~/-9A-Z!#-'*+=?]+(?:\.[-^-~/-9A-Z!#-'*+=?]+)*
|
"
(?:
[!#-[\]-~]
|
\\[ -~\t]
)+
"
)@(?:
(?=.{4,254}$)(?:[a-z0-9A-Z](?:[a-z0-9A-Z-]{0,61}[a-z0-9A-Z]|)\.)+[a-zA-Z][a-z0-9A-Z-]{0,61}[a-z0-9A-Z]
|
\[
(?:25[0-5]|(?:1[0-9]|2[0-4]|[1-9]|)[0-9])(?:\.(?:25[0-5]|(?:1[0-9]|2[0-4]|[1-9]|)[0-9])){3}
|
[a-zA-Z0-9-]*[a-zA-Z0-9]:[!-Z^-~]
\]
)$
For comparison, the original, as seems to have been reposted by @DouglasDaseeco:
^(?:
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*
|
"
(?:
[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]
|
\\[\x01-\x09\x0b\x0c\x0e-\x7f]
)*
"
)@(?:
(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
|
\[
(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])
|
[a-z0-9-]*[a-z0-9]:
(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)
\]
)$
Specification
/^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
Optimized: Strict #1
/^[--9^-~A-Z!#-'*+=?]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
Alternative
/^[--9^-~A-Z!#-'*+=?]{1,64}@(?=.{1,254}$)[a-z0-9A-Z](?:[a-z0-9A-Z-]{0,61}[a-z0-9A-Z]|)(?:\.[a-z0-9A-Z](?:[a-z0-9A-Z-]{0,61}[a-z0-9A-Z]|))*$
Original:
^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$
Optimized:
^[--9^-~A-Z!#-'*+=?]+@[a-z0-9A-Z](?:[a-z0-9A-Z-]{0,61}[a-z0-9A-Z]|)(?:\.[a-z0-9A-Z](?:[a-z0-9A-Z-]{0,61}[a-z0-9A-Z]|))*$
Alternative:
^[--9^-~A-Z!#-'*+=?]{1,64}@(?=.{1,254}$)[a-z0-9A-Z](?:[a-z0-9A-Z-]{0,61}[a-z0-9A-Z]|)(?:\.[a-z0-9A-Z](?:[a-z0-9A-Z-]{0,61}[a-z0-9A-Z]|))*$
address = mailbox / group
mailbox = name-addr / addr-spec
name-addr = [display-name] angle-addr
angle-addr = [CFWS] "<" addr-spec ">" [CFWS] / obs-angle-addr
group = display-name ":" [group-list] ";" [CFWS]
display-name = phrase
mailbox-list = (mailbox *("," mailbox)) / obs-mbox-list
address-list = (address *("," address)) / obs-addr-list
group-list = mailbox-list / CFWS / obs-group-list
addr-spec = local-part "@" domain
local-part = dot-atom / quoted-string / obs-local-part
domain = dot-atom / domain-literal / obs-domain
domain-literal = [CFWS] "[" *([FWS] dtext) "]" [CFWS]
dtext = %d33-90 / %d94-126 / obs-dtext
; Printable US-ASCII characters not including "[", "]", or "\"
quoted-string = [CFWS] DQUOTE *([FWS] qcontent) [FWS] DQUOTE [CFWS]
qcontent = qtext / quoted-pair
qtext = %d33 / %d35-91 / %d93-126 / obs-qtext
; Printable US-ASCII characters not including "\" or the quote character
dot-atom = [CFWS] dot-atom-text [CFWS]
dot-atom-text = 1*atext *("." atext)
atext = ALPHA / DIGIT / "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "/" /
"=" / "?" / "^" / "_" / "`" / "{" / "|" / "}" / "~"
dtext = !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ^_`abcdefghijklmnopqrstuvwxyz{|}~
= !-Z^-~
qtext = !#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_`abcdefghijklmnopqrstuvwxyz{|}~
= !#-[\]-~
atext = abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!#$%&'*+-/=?^_`{|}~
= -^-~/-9A-Z!#-'*+=?
DIGIT = %x30-39 ; 0-9
= 0-9
= \d
ALPHA = %x41-5A / %x61-7A ; A-Z / a-z
= A-Za-z
VCHAR = %x21-7E ; Visible (printing) characters
= !-~
WSP = SP / HTAB ; White space
= [ \t]
Last part of the IPv4 pattern seems to have been grouped together with the address-literal pattern.
Including but not limited to \x7f which is equal to ASCII 127 or DEL.
RFC 5321 section 4.1.2:
Systems MUST NOT define mailboxes in such a way as to require the use in SMTP of non-ASCII characters (octets with the high order bit set to one) or ASCII "control characters" (decimal value 0-31 and 127). These characters MUST NOT be used in MAIL or RCPT commands or other commands that require mailbox names.
RFC 5321 sections 4.1.2, 4.1.3:
address-literal = "[" ( IPv4-address-literal /
IPv6-address-literal /
General-address-literal ) "]"
IPv4-address-literal = Snum 3("." Snum)
IPv6-address-literal = "IPv6:" IPv6-addr
General-address-literal = Standardized-tag ":" 1*dcontent
Standardized-tag = Ldh-str
; Standardized-tag MUST be specified in a
; Standards-Track RFC and registered with IANA
Ldh-str = *( ALPHA / DIGIT / "-" ) Let-dig
If these solutions are cross-checked and peer-verified, anyone may incorporate this info into the original community-wiki answer, with appropriate credit.
Upvotes: 3
Reputation: 511
if you are looking to check against a simple format: [email protected] or [email protected]
/^[^.\s@]+(\.[^.\s@]+)*@[^.\s@]+(\.[^.\s@]+)+$/
exact format used: ^x(.x)*@x(.x)+$
x
any non-empty string without .
(dot), \s
(white space character) and @
(at) in it
*
repeat zero or more times
+
repeat one or more times
^
start of string
$
end of string
Upvotes: 0
Reputation: 21899
According to the official standard, RFC 2822, a valid email regex is:
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
If you want to use it in Java, it's really very easy:
import java.util.regex.*;
class regexSample
{
public static void main(String args[])
{
//Input the string for validation
String email = "[email protected]";
//Set the email pattern string
Pattern p = Pattern.compile(" (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"
+"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]|\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")"
+ "@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]|\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\]");
//Match the given string with the pattern
Matcher m = p.matcher(email);
//Check whether match is found
boolean matchFound = m.matches();
if (matchFound)
System.out.println("Valid Email Id.");
else
System.out.println("Invalid Email Id.");
}
}
Upvotes: 16
Reputation: 19644
I'm still using:
^[A-Za-z0-9._+\-\']+@[A-Za-z0-9.\-]+\.[A-Za-z]{2,}$
But with IPv6 and Unicode coming up, perhaps this is best:
console.log(/^[\p{L}!#-'*+\-/\d=?^-~]+(.[\p{L}!#-'*+\-/\d=?^-~])*@[^@\s]{2,}$/u.test("תה.בועות@😀.fm"))
Gmail allows sequential dots, but Microsoft Exchange Server 2007 refuses them, which follows the most recent standard afaik.
Upvotes: 7
Reputation: 35459
The fully RFC 822 compliant regex is inefficient and obscure because of its length. Fortunately, RFC 822 was superseded twice and the current specification for email addresses is RFC 5322. RFC 5322 leads to a regex that can be understood if studied for a few minutes and is efficient enough for actual use.
One RFC 5322 compliant regex can be found at the top of the page at http://emailregex.com/ but uses the IP address pattern that is floating around the internet with a bug that allows 00
for any of the unsigned byte decimal values in a dot-delimited address, which is illegal. The rest of it appears to be consistent with the RFC 5322 grammar and passes several tests using grep -Po
, including cases domain names, IP addresses, bad ones, and account names with and without quotes.
Correcting the 00
bug in the IP pattern, we obtain a working and fairly fast regex. (Scrape the rendered version, not the markdown, for actual code.)
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
or:
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
Here is diagram of finite state machine for above regexp which is more clear than regexp itself
The more sophisticated patterns in Perl and PCRE (regex library used e.g. in PHP) can correctly parse RFC 5322 without a hitch. Python and C# can do that too, but they use a different syntax from those first two. However, if you are forced to use one of the many less powerful pattern-matching languages, then it’s best to use a real parser.
It's also important to understand that validating it per the RFC tells you absolutely nothing about whether that address actually exists at the supplied domain, or whether the person entering the address is its true owner. People sign others up to mailing lists this way all the time. Fixing that requires a fancier kind of validation that involves sending that address a message that includes a confirmation token meant to be entered on the same web page as was the address.
Confirmation tokens are the only way to know you got the address of the person entering it. This is why most mailing lists now use that mechanism to confirm sign-ups. After all, anybody can put down [email protected]
, and that will even parse as legal, but it isn't likely to be the person at the other end.
For PHP, you should not use the pattern given in Validate an E-Mail Address with PHP, the Right Way from which I quote:
There is some danger that common usage and widespread sloppy coding will establish a de facto standard for e-mail addresses that is more restrictive than the recorded formal standard.
That is no better than all the other non-RFC patterns. It isn’t even smart enough to handle even RFC 822, let alone RFC 5322. This one, however, is.
If you want to get fancy and pedantic, implement a complete state engine. A regular expression can only act as a rudimentary filter. The problem with regular expressions is that telling someone that their perfectly valid e-mail address is invalid (a false positive) because your regular expression can't handle it is just rude and impolite from the user's perspective. A state engine for the purpose can both validate and even correct e-mail addresses that would otherwise be considered invalid as it disassembles the e-mail address according to each RFC. This allows for a potentially more pleasing experience, like
The specified e-mail address 'myemail@address,com' is invalid. Did you mean '[email protected]'?
See also Validating Email Addresses, including the comments. Or Comparing E-mail Address Validating Regular Expressions.
Upvotes: 3477
Reputation: 5292
I did not find any that deals with a top-level domain name, but it should be considered.
So for me the following worked:
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+(?:[A-Z]{2}AAA|AARP|ABB|ABBOTT|ABOGADO|AC|ACADEMY|ACCENTURE|ACCOUNTANT|ACCOUNTANTS|ACO|ACTIVE|ACTOR|AD|ADAC|ADS|ADULT|AE|AEG|AERO|AF|AFL|AG|AGENCY|AI|AIG|AIRFORCE|AIRTEL|AL|ALIBABA|ALIPAY|ALLFINANZ|ALSACE|AM|AMICA|AMSTERDAM|ANALYTICS|ANDROID|AO|APARTMENTS|APP|APPLE|AQ|AQUARELLE|AR|ARAMCO|ARCHI|ARMY|ARPA|ARTE|AS|ASIA|ASSOCIATES|AT|ATTORNEY|AU|AUCTION|AUDI|AUDIO|AUTHOR|AUTO|AUTOS|AW|AX|AXA|AZ|AZURE|BA|BAIDU|BAND|BANK|BAR|BARCELONA|BARCLAYCARD|BARCLAYS|BARGAINS|BAUHAUS|BAYERN|BB|BBC|BBVA|BCN|BD|BE|BEATS|BEER|BENTLEY|BERLIN|BEST|BET|BF|BG|BH|BHARTI|BI|BIBLE|BID|BIKE|BING|BINGO|BIO|BIZ|BJ|BLACK|BLACKFRIDAY|BLOOMBERG|BLUE|BM|BMS|BMW|BN|BNL|BNPPARIBAS|BO|BOATS|BOEHRINGER|BOM|BOND|BOO|BOOK|BOOTS|BOSCH|BOSTIK|BOT|BOUTIQUE|BR|BRADESCO|BRIDGESTONE|BROADWAY|BROKER|BROTHER|BRUSSELS|BS|BT|BUDAPEST|BUGATTI|BUILD|BUILDERS|BUSINESS|BUY|BUZZ|BV|BW|BY|BZ|BZH|CA|CAB|CAFE|CAL|CALL|CAMERA|CAMP|CANCERRESEARCH|CANON|CAPETOWN|CAPITAL|CAR|CARAVAN|CARDS|CARE|CAREER|CAREERS|CARS|CARTIER|CASA|CASH|CASINO|CAT|CATERING|CBA|CBN|CC|CD|CEB|CENTER|CEO|CERN|CF|CFA|CFD|CG|CH|CHANEL|CHANNEL|CHAT|CHEAP|CHLOE|CHRISTMAS|CHROME|CHURCH|CI|CIPRIANI|CIRCLE|CISCO|CITIC|CITY|CITYEATS|CK|CL|CLAIMS|CLEANING|CLICK|CLINIC|CLINIQUE|CLOTHING|CLOUD|CLUB|CLUBMED|CM|CN|CO|COACH|CODES|COFFEE|COLLEGE|COLOGNE|COM|COMMBANK|COMMUNITY|COMPANY|COMPARE|COMPUTER|COMSEC|CONDOS|CONSTRUCTION|CONSULTING|CONTACT|CONTRACTORS|COOKING|COOL|COOP|CORSICA|COUNTRY|COUPONS|COURSES|CR|CREDIT|CREDITCARD|CREDITUNION|CRICKET|CROWN|CRS|CRUISES|CSC|CU|CUISINELLA|CV|CW|CX|CY|CYMRU|CYOU|CZ|DABUR|DAD|DANCE|DATE|DATING|DATSUN|DAY|DCLK|DE|DEALER|DEALS|DEGREE|DELIVERY|DELL|DELTA|DEMOCRAT|DENTAL|DENTIST|DESI|DESIGN|DEV|DIAMONDS|DIET|DIGITAL|DIRECT|DIRECTORY|DISCOUNT|DJ|DK|DM|DNP|DO|DOCS|DOG|DOHA|DOMAINS|DOOSAN|DOWNLOAD|DRIVE|DUBAI|DURBAN|DVAG|DZ|EARTH|EAT|EC|EDEKA|EDU|EDUCATION|EE|EG|EMAIL|EMERCK|ENERGY|ENGINEER|ENGINEERING|ENTERPRISES|EPSON|EQUIPMENT|ER|ERNI|ES|ESQ|ESTATE|ET|EU|EUROVISION|EUS|EVENTS|EVERBANK|EXCHANGE|EXPERT|EXPOSED|EXPRESS|FAGE|FAIL|FAIRWINDS|FAITH|FAMILY|FAN|FANS|FARM|FASHION|FAST|FEEDBACK|FERRERO|FI|FILM|FINAL|FINANCE|FINANCIAL|FIRESTONE|FIRMDALE|FISH|FISHING|FIT|FITNESS|FJ|FK|FLIGHTS|FLORIST|FLOWERS|FLSMIDTH|FLY|FM|FO|FOO|FOOTBALL|FORD|FOREX|FORSALE|FORUM|FOUNDATION|FOX|FR|FRESENIUS|FRL|FROGANS|FUND|FURNITURE|FUTBOL|FYI|GA|GAL|GALLERY|GAME|GARDEN|GB|GBIZ|GD|GDN|GE|GEA|GENT|GENTING|GF|GG|GGEE|GH|GI|GIFT|GIFTS|GIVES|GIVING|GL|GLASS|GLE|GLOBAL|GLOBO|GM|GMAIL|GMO|GMX|GN|GOLD|GOLDPOINT|GOLF|GOO|GOOG|GOOGLE|GOP|GOT|GOV|GP|GQ|GR|GRAINGER|GRAPHICS|GRATIS|GREEN|GRIPE|GROUP|GS|GT|GU|GUCCI|GUGE|GUIDE|GUITARS|GURU|GW|GY|HAMBURG|HANGOUT|HAUS|HEALTH|HEALTHCARE|HELP|HELSINKI|HERE|HERMES|HIPHOP|HITACHI|HIV|HK|HM|HN|HOCKEY|HOLDINGS|HOLIDAY|HOMEDEPOT|HOMES|HONDA|HORSE|HOST|HOSTING|HOTELES|HOTMAIL|HOUSE|HOW|HR|HSBC|HT|HU|HYUNDAI|IBM|ICBC|ICE|ICU|ID|IE|IFM|IINET|IL|IM|IMMO|IMMOBILIEN|IN|INDUSTRIES|INFINITI|INFO|ING|INK|INSTITUTE|INSURANCE|INSURE|INT|INTERNATIONAL|INVESTMENTS|IO|IPIRANGA|IQ|IR|IRISH|IS|ISELECT|IST|ISTANBUL|IT|ITAU|IWC|JAGUAR|JAVA|JCB|JE|JETZT|JEWELRY|JLC|JLL|JM|JMP|JO|JOBS|JOBURG|JOT|JOY|JP|JPRS|JUEGOS|KAUFEN|KDDI|KE|KFH|KG|KH|KI|KIA|KIM|KINDER|KITCHEN|KIWI|KM|KN|KOELN|KOMATSU|KP|KPN|KR|KRD|KRED|KW|KY|KYOTO|KZ|LA|LACAIXA|LAMBORGHINI|LAMER|LANCASTER|LAND|LANDROVER|LANXESS|LASALLE|LAT|LATROBE|LAW|LAWYER|LB|LC|LDS|LEASE|LECLERC|LEGAL|LEXUS|LGBT|LI|LIAISON|LIDL|LIFE|LIFEINSURANCE|LIFESTYLE|LIGHTING|LIKE|LIMITED|LIMO|LINCOLN|LINDE|LINK|LIVE|LIVING|LIXIL|LK|LOAN|LOANS|LOL|LONDON|LOTTE|LOTTO|LOVE|LR|LS|LT|LTD|LTDA|LU|LUPIN|LUXE|LUXURY|LV|LY|MA|MADRID|MAIF|MAISON|MAKEUP|MAN|MANAGEMENT|MANGO|MARKET|MARKETING|MARKETS|MARRIOTT|MBA|MC|MD|ME|MED|MEDIA|MEET|MELBOURNE|MEME|MEMORIAL|MEN|MENU|MEO|MG|MH|MIAMI|MICROSOFT|MIL|MINI|MK|ML|MM|MMA|MN|MO|MOBI|MOBILY|MODA|MOE|MOI|MOM|MONASH|MONEY|MONTBLANC|MORMON|MORTGAGE|MOSCOW|MOTORCYCLES|MOV|MOVIE|MOVISTAR|MP|MQ|MR|MS|MT|MTN|MTPC|MTR|MU|MUSEUM|MUTUELLE|MV|MW|MX|MY|MZ|NA|NADEX|NAGOYA|NAME|NAVY|NC|NE|NEC|NET|NETBANK|NETWORK|NEUSTAR|NEW|NEWS|NEXUS|NF|NG|NGO|NHK|NI|NICO|NINJA|NISSAN|NL|NO|NOKIA|NORTON|NOWRUZ|NP|NR|NRA|NRW|NTT|NU|NYC|NZ|OBI|OFFICE|OKINAWA|OM|OMEGA|ONE|ONG|ONL|ONLINE|OOO|ORACLE|ORANGE|ORG|ORGANIC|ORIGINS|OSAKA|OTSUKA|OVH|PA|PAGE|PAMPEREDCHEF|PANERAI|PARIS|PARS|PARTNERS|PARTS|PARTY|PE|PET|PF|PG|PH|PHARMACY|PHILIPS|PHOTO|PHOTOGRAPHY|PHOTOS|PHYSIO|PIAGET|PICS|PICTET|PICTURES|PID|PIN|PING|PINK|PIZZA|PK|PL|PLACE|PLAY|PLAYSTATION|PLUMBING|PLUS|PM|PN|POHL|POKER|PORN|POST|PR|PRAXI|PRESS|PRO|PROD|PRODUCTIONS|PROF|PROMO|PROPERTIES|PROPERTY|PROTECTION|PS|PT|PUB|PW|PY|QA|QPON|QUEBEC|RACING|RE|READ|REALTOR|REALTY|RECIPES|RED|REDSTONE|REDUMBRELLA|REHAB|REISE|REISEN|REIT|REN|RENT|RENTALS|REPAIR|REPORT|REPUBLICAN|REST|RESTAURANT|REVIEW|REVIEWS|REXROTH|RICH|RICOH|RIO|RIP|RO|ROCHER|ROCKS|RODEO|ROOM|RS|RSVP|RU|RUHR|RUN|RW|RWE|RYUKYU|SA|SAARLAND|SAFE|SAFETY|SAKURA|SALE|SALON|SAMSUNG|SANDVIK|SANDVIKCOROMANT|SANOFI|SAP|SAPO|SARL|SAS|SAXO|SB|SBS|SC|SCA|SCB|SCHAEFFLER|SCHMIDT|SCHOLARSHIPS|SCHOOL|SCHULE|SCHWARZ|SCIENCE|SCOR|SCOT|SD|SE|SEAT|SECURITY|SEEK|SELECT|SENER|SERVICES|SEVEN|SEW|SEX|SEXY|SFR|SG|SH|SHARP|SHELL|SHIA|SHIKSHA|SHOES|SHOW|SHRIRAM|SI|SINGLES|SITE|SJ|SK|SKI|SKIN|SKY|SKYPE|SL|SM|SMILE|SN|SNCF|SO|SOCCER|SOCIAL|SOFTBANK|SOFTWARE|SOHU|SOLAR|SOLUTIONS|SONY|SOY|SPACE|SPIEGEL|SPREADBETTING|SR|SRL|ST|STADA|STAR|STARHUB|STATEFARM|STATOIL|STC|STCGROUP|STOCKHOLM|STORAGE|STUDIO|STUDY|STYLE|SU|SUCKS|SUPPLIES|SUPPLY|SUPPORT|SURF|SURGERY|SUZUKI|SV|SWATCH|SWISS|SX|SY|SYDNEY|SYMANTEC|SYSTEMS|SZ|TAB|TAIPEI|TAOBAO|TATAMOTORS|TATAR|TATTOO|TAX|TAXI|TC|TCI|TD|TEAM|TECH|TECHNOLOGY|TEL|TELEFONICA|TEMASEK|TENNIS|TF|TG|TH|THD|THEATER|THEATRE|TICKETS|TIENDA|TIFFANY|TIPS|TIRES|TIROL|TJ|TK|TL|TM|TMALL|TN|TO|TODAY|TOKYO|TOOLS|TOP|TORAY|TOSHIBA|TOURS|TOWN|TOYOTA|TOYS|TR|TRADE|TRADING|TRAINING|TRAVEL|TRAVELERS|TRAVELERSINSURANCE|TRUST|TRV|TT|TUBE|TUI|TUSHU|TV|TW|TZ|UA|UBS|UG|UK|UNIVERSITY|UNO|UOL|US|UY|UZ|VA|VACATIONS|VANA|VC|VE|VEGAS|VENTURES|VERISIGN|VERSICHERUNG|VET|VG|VI|VIAJES|VIDEO|VILLAS|VIN|VIP|VIRGIN|VISION|VISTA|VISTAPRINT|VIVA|VLAANDEREN|VN|VODKA|VOLKSWAGEN|VOTE|VOTING|VOTO|VOYAGE|VU|WALES|WALTER|WANG|WANGGOU|WATCH|WATCHES|WEATHER|WEBCAM|WEBER|WEBSITE|WED|WEDDING|WEIR|WF|WHOSWHO|WIEN|WIKI|WILLIAMHILL|WIN|WINDOWS|WINE|WME|WORK|WORKS|WORLD|WS|WTC|WTF|XBOX|XEROX|XIN|XN--11B4C3D|XN--1QQW23A|XN--30RR7Y|XN--3BST00M|XN--3DS443G|XN--3E0B707E|XN--3PXU8K|XN--42C2D9A|XN--45BRJ9C|XN--45Q11C|XN--4GBRIM|XN--55QW42G|XN--55QX5D|XN--6FRZ82G|XN--6QQ986B3XL|XN--80ADXHKS|XN--80AO21A|XN--80ASEHDB|XN--80ASWG|XN--90A3AC|XN--90AIS|XN--9DBQ2A|XN--9ET52U|XN--B4W605FERD|XN--C1AVG|XN--C2BR7G|XN--CG4BKI|XN--CLCHC0EA0B2G2A9GCD|XN--CZR694B|XN--CZRS0T|XN--CZRU2D|XN--D1ACJ3B|XN--D1ALF|XN--ECKVDTC9D|XN--EFVY88H|XN--ESTV75G|XN--FHBEI|XN--FIQ228C5HS|XN--FIQ64B|XN--FIQS8S|XN--FIQZ9S|XN--FJQ720A|XN--FLW351E|XN--FPCRJ9C3D|XN--FZC2C9E2C|XN--G2XX48C|XN--GECRJ9C|XN--H2BRJ9C|XN--HXT814E|XN--I1B6B1A6A2E|XN--IMR513N|XN--IO0A7I|XN--J1AEF|XN--J1AMH|XN--J6W193G|XN--JLQ61U9W7B|XN--KCRX77D1X4A|XN--KPRW13D|XN--KPRY57D|XN--KPU716F|XN--KPUT3I|XN--L1ACC|XN--LGBBAT1AD8J|XN--MGB9AWBF|XN--MGBA3A3EJT|XN--MGBA3A4F16A|XN--MGBAAM7A8H|XN--MGBAB2BD|XN--MGBAYH7GPA|XN--MGBB9FBPOB|XN--MGBBH1A71E|XN--MGBC0A9AZCG|XN--MGBERP4A5D4AR|XN--MGBPL2FH|XN--MGBT3DHD|XN--MGBTX2B|XN--MGBX4CD0AB|XN--MK1BU44C|XN--MXTQ1M|XN--NGBC5AZD|XN--NGBE9E0A|XN--NODE|XN--NQV7F|XN--NQV7FS00EMA|XN--NYQY26A|XN--O3CW4H|XN--OGBPF8FL|XN--P1ACF|XN--P1AI|XN--PBT977C|XN--PGBS0DH|XN--PSSY2U|XN--Q9JYB4C|XN--QCKA1PMC|XN--QXAM|XN--RHQV96G|XN--S9BRJ9C|XN--SES554G|XN--T60B56A|XN--TCKWE|XN--UNUP4Y|XN--VERMGENSBERATER-CTB|XN--VERMGENSBERATUNG-PWB|XN--VHQUV|XN--VUQ861B|XN--WGBH1C|XN--WGBL6A|XN--XHQ521B|XN--XKC2AL3HYE2A|XN--XKC2DL3A5EE0H|XN--Y9A3AQ|XN--YFRO4I67O|XN--YGBI2AMMX|XN--ZFR164B|XPERIA|XXX|XYZ|YACHTS|YAMAXUN|YANDEX|YE|YODOBASHI|YOGA|YOKOHAMA|YOUTUBE|YT|ZA|ZARA|ZERO|ZIP|ZM|ZONE|ZUERICH|ZW)\b
That easily discarded emails like [email protected], [email protected]
, etc.
The domain name can be further edited if needed, e.g., specific country domain, etc.
Another list of top level domains that updates frequently.
Upvotes: 2
Reputation: 2356
I'd like to propose my approach which is relatively simple while ensuring proper email structure and restricting forbidden characters. Valid for latin characters.
/^(?![\w\.@]*\.\.)(?![\w\.@]*\.@)(?![\w\.]*@\.)\w+[\w\.]*@[\w\.]+\.\w{2,}$/
Upvotes: 1
Reputation: 71
The question title is fairly generic, however the body of the question indicates that it is about the PHP based solution. Will try to address both.
Generically speaking, for all programming languages: Typically, validating" an e-mail address with a reg-ex is something that any internet based service provider should desist from. The possibilities of kinds of domain names and e-mail addresses have increased so much in terms of variety, any attempt at validation, which is not well thought may end up denying some valid users into your system. To avoid this, one of the best ways is to send an email to the user and verify it being received. The good folks at "Universal Acceptance Steering Group" have compiled a languagewise list of libraries which are found to be compliant/non-compliant with various parameters involving validations vis-a-vis Internationalized Domain Names and Internationalized Email addresses. Please find the links to those documents over here and here.
Speaking specifically of PHP:
There is one good library available in PHP i.e. EmailValidator. It is an email address validator that includes many validation methods such as DNS validation. The validator specifically recommended is called RFCValidator and validates email addresses against several RFCs. It has good compliance when it comes to being inclusive towards IDNs and Internationalized Email addresses.
Upvotes: 0
Reputation: 3753
I use multi-step validation. As there isn't any perfect way to validate an email address, a perfect one can't be made, but at least you can notify the user he/she is doing something wrong - here is my approach:
I first validate with the very basic regex which just checks if the email contains exactly one @ sign and it is not blank before or after that sign. e.g. /^[^@\s]+@[^@\s]+$/
if the first validator does not pass (and for most addresses it should although it is not perfect), then warn the user the email is invalid and do not allow him/her to continue with the input
if it passes, then validate against a more strict regex - something which might disallow valid emails. If it does not pass, the user is warned about a possible error, but the user is allowed to continue. Unlike step (1) where the user is not allowed to continue because it is an obvious error.
So in other words, the first liberal validation is just to strip obvious errors and it is treated as "error". People type a blank address, address without @ sign and so on. This should be treated as an error. The second one is more strict, but it is treated as a "warning" and the user is allowed to continue with the input, but warned to at least examine if he/she entered a valid entry. The key here is in the error/warning approach - the error being something that can't under 99.99% circumstances be a valid email.
Of course, you can adjust what makes the first regex more liberal and the second one more strict.
Depending on what you need, the above approach might work for you.
Upvotes: 8
Reputation: 3574
I always use the below regular expression to validate the email address. It covers all formats of email addresses based on English language characters.
"\A(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)\Z";
Given below is a C# example:
Add the assembly reference:
using System.Text.RegularExpressions;
and use the below method to pass the email address and get a boolean in return
private bool IsValidEmail(string email) {
bool isValid = false;
const string pattern = @"\A(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)\Z";
isValid = email != "" && Regex.IsMatch(email, pattern);
// Same above approach in multiple lines
//
//if (!email) {
// isValid = false;
//} else {
// // email param contains a value; Pass it to the isMatch method
// isValid = Regex.IsMatch(email, pattern);
//}
return isValid;
}
This method validates the email string passed in the parameter. It will return false for all cases where param is null, empty string, undefined or the param value is not a valid email address. It will only return true when the param contains a valid email address string.
Upvotes: 4
Reputation: 1820
The regular expression that I use:
[\w-+]+([.][\w]+)?@[\w-+]+([.][a-z]{2,})+
Upvotes: -3
Reputation: 163
If you need a simple form to validate, you can use the answer of https://regexr.com/3e48o
^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$
let r = new RegExp(String.raw `^[\w-\.]+@([\w-]+\.)+[\w-]{2,4}$`);
//should be true
console.log(r.test('[email protected]'));
console.log(r.test('[email protected]'));
console.log(r.test('[email protected]'));
//should be false
console.log(r.test('@domain.tld'));
console.log(r.test('[email protected]'));
console.log(r.test('name@domain.'));
console.log(r.test('namedomain.tld'));
console.log(r.test(''));
//now that basic client-side validation is done, send a token from the server side to validate the user actually has access to the email
Upvotes: 1
Reputation: 5261
Although very detailed answers are already added, I think those are complex enough for a developer who is just looking for a simple method to validate an email address or to get all email addresses from a string in Java.
public static boolean isEmailValid(@NonNull String email) {
return android.util.Patterns.EMAIL_ADDRESS.matcher(email).matches();
}
As per the regular expression is concerned, I always use this regular expression, which works for my problems.
"[A-Z0-9a-z._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}"
If you are looking to find all email addresses from a string by matching the email regular expression. You can find a method at this link.
Upvotes: 4
Reputation: 8019
According to RFC 2821 and RFC 2822, the local-part of an email addresses may use any of these ASCII characters:
Matches:
Non-Matches:
For one that is RFC 2821 and 2822 compliant, you can use:
^((([!#$%&'*+\-/=?^_`{|}~\w])|([!#$%&'*+\-/=?^_`{|}~\w][!#$%&'*+\-/=?^_`{|}~\.\w]{0,}[!#$%&'*+\-/=?^_`{|}~\w]))[@]\w+([-.]\w+)*\.\w+([-.]\w+)*)$
Email - RFC 2821, 2822 Compliant
Upvotes: 4
Reputation: 43
Writing a regular expression for all the things will take a lot of effort. Instead, you can use pyIsEmail package.
Below text is taken from pyIsEmail website.
pyIsEmail is a no-nonsense approach for checking whether that user-supplied email address could be real.
Regular expressions are cheap to write, but often require maintenance when new top-level domains come out or don’t conform to email addressing features that come back into vogue. pyIsEmail allows you to validate an email address – and even check the domain, if you wish – with one simple call, making your code more readable and faster to write. When you want to know why an email address doesn’t validate, they even provide you with a diagnosis.
For the simplest usage, import and use the is_email function:
from pyisemail import is_email
address = "[email protected]"
bool_result = is_email(address)
detailed_result = is_email(address, diagnose=True)
You can also check whether the domain used in the email is a valid domain and whether or not it has a valid MX record:
from pyisemail import is_email
address = "[email protected]"
bool_result_with_dns = is_email(address, check_dns=True)
detailed_result_with_dns = is_email(address, check_dns=True, diagnose=True)
These are primary indicators of whether an email address can even be issued at that domain. However, a valid response here is not a guarantee that the email exists, merely that is can exist.
In addition to the base is_email functionality, you can also use the validators by themselves. Check the validator source doc to see how this works.
Upvotes: 2
Reputation: 2929
I converted the code into Java to match the compiler:
String pattern = "(?:[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)*|\"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")@(?:(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?\\.)+[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?|\\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-zA-Z0-9-]*[a-zA-Z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\])";
Upvotes: 1
Reputation: 312
List item
I use this function
function checkmail($value) {
$value = trim($value);
if (stristr($value,"@") &&
stristr($value,".") &&
(strrpos($value, ".") - stripos($value, "@") > 2) &&
(stripos($value, "@") > 1) &&
(strlen($value) - strrpos($value, ".") < 6) &&
(strlen($value) - strrpos($value, ".") > 2) &&
($value == preg_replace('/[ ]/', '', $value)) &&
($value == preg_replace('/[^A-Za-z0-9\-_.@!*]/', '', $value))
)
{
}
else {
return "Invalid Mail-Id";
}
}
Upvotes: 1
Reputation: 145950
Just about every regular expression I've seen - including some used by Microsoft will not allow the following valid email to get through: [email protected]
I just had a real customer with an email address in this format who couldn't place an order.
Here's what I settled on:
MailAddress
constructor with some additional checks (see below):.cmo
or .gmial.com
and asking for confirmation "Are you sure this is your correct email address. It looks like there may be a mistake." Allow the user to accept what they typed if they are sure.try
{
var email = new MailAddress(str);
if (email.Host.EndsWith(".cmo"))
{
return EmailValidation.PossibleTypo;
}
if (!email.Host.EndsWith(".") && email.Host.Contains("."))
{
return EmailValidation.OK;
}
}
catch
{
return EmailValidation.Invalid;
}
Upvotes: 4
Reputation: 1004
For me the right way for checking email addresses is:
/^[^@]+@[^@]+$/
Of course, you can show some warning or tooltip in front-end when the user typed a "strange" email to help him/her to avoid common mistakes, like no dot in the domain part or spaces in name without quoting and so on. But you must accept the address "hello@world" if user really want it.
Also, you must remember that the email address standard was and can evolve, so you can't just type some "standard-valid" regexp once and for all times. And you must remember that some concrete internet servers can fail some details of common standard and in fact work with own "modified standard".
So, just check @, hint user on frontend and send verification emails on the given address.
Upvotes: 4
Reputation: 1488
The HTML5 specification suggests a simple regex for validating email addresses:
/^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
This intentionally doesn't comply with RFC 5322.
Note: This requirement is a wilful violation of RFC 5322, which defines a syntax for e-mail addresses that is simultaneously too strict (before the
@
character), too vague (after the@
character), and too lax (allowing comments, whitespace characters, and quoted strings in manners unfamiliar to most users) to be of practical use here.
The total length could also be limited to 254 characters, per RFC 3696 errata 1690.
Upvotes: 26
Reputation: 1812
For PHP I'm using the email address validator from the Nette Framework:
/* public static */ function isEmail($value)
{
$atom = "[-a-z0-9!#$%&'*+/=?^_`{|}~]"; // RFC 5322 unquoted characters in local-part
$localPart = "(?:\"(?:[ !\\x23-\\x5B\\x5D-\\x7E]*|\\\\[ -~])+\"|$atom+(?:\\.$atom+)*)"; // Quoted or unquoted
$alpha = "a-z\x80-\xFF"; // Superset of IDN
$domain = "[0-9$alpha](?:[-0-9$alpha]{0,61}[0-9$alpha])?"; // RFC 1034 one domain component
$topDomain = "[$alpha](?:[-0-9$alpha]{0,17}[$alpha])?";
return (bool) preg_match("(^$localPart@(?:$domain\\.)+$topDomain\\z)i", $value);
}
Upvotes: 5
Reputation: 766
You can use following regular expression for any email address:
^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|(\".+\"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$
function checkEmailValidation($email)
{
$expression = '/^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|(\".+\"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/';
if(preg_match($expression, $email))
{
return true;
}
else
{
return false;
}
}
function checkEmailValidation(email)
{
var pattern = '/^(([^<>()[\]\\.,;:\s@\"]+(\.[^<>()[\]\\.,;:\s@\"]+)*)|(\".+\"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/';
if(pattern.test(email))
{
return true;
}
else
{
return false;
}
}
Upvotes: 0
Reputation: 722
Following is the regular expression for validating an email address:
^.+@\w+(\.\w+)+$
Upvotes: 3
Reputation: 1793
As mentioned already, you can't validate an email with a regex. However, here's what we currently use to make sure user-input isn't totally bogus (forgetting the TLD, etc.).
This regex will allow IDN domains and special characters (like Umlauts) before and after the @ sign.
/^[\w.+-_]+@[^.][\w.-]*\.[\w-]{2,63}$/iu
Upvotes: 1
Reputation: 10943
Java Mail API does magic for us.
try
{
InternetAddress internetAddress = new InternetAddress(email);
internetAddress.validate();
return true;
}
catch(Exception ex)
{
return false;
}
I got this from here.
Upvotes: 2
Reputation: 1362
The regular expressions posted for this question are out of date now, because of the new generic top-level domains (gTLDs) coming in (e.g. .london, .basketball, .通販). To validate an email address there are two answers (that would be relevant to the vast majority).
{something}@{something}.{something}
. There's no point in going for a detailed regex, because you won't catch them all and there'll be a new batch in a few years and you'll have to update your regular expression again.I have decided to use the regular expression because, unfortunately, some users don't read forms and put the wrong data in the wrong fields. This will at least alert them when they try to put something which isn't an email into the email input field and should save you some time supporting users on email issues.
(.+)@(.+){2,}\.(.+){2,}
Upvotes: 3
Reputation: 1585
There has nearly been added a new domain, "yandex". Possible emails: [email protected]. And also uppercase letters are supported, so a bit modified version of acrosman's solution is:
^[_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*(\.[a-zA-Z]{2,6})$
Upvotes: 2
Reputation: 113
If you want to improve on a regex that has been working reasonably well over several years, then the answer depends on what exactly you want to achieve - what kinds of email addresses have been failing. Fine-tuning email regexes is very difficult, and I have yet to see a perfect solution.
The leading answer to your question currently links to a "fully RFC‑822–compliant regex". However, in spite of the complexity of that regex and its presumed attention to detail in RFC rules, it completely fails when it comes to Unicode support.
The regex that I've written for most of my applications focuses on Unicode support, as well as reasonably good overall adherence to RFC standards:
/^(?!\.)((?!.*\.{2})[a-zA-Z0-9\u0080-\u00FF\u0100-\u017F\u0180-\u024F\u0250-\u02AF\u0300-\u036F\u0370-\u03FF\u0400-\u04FF\u0500-\u052F\u0530-\u058F\u0590-\u05FF\u0600-\u06FF\u0700-\u074F\u0750-\u077F\u0780-\u07BF\u07C0-\u07FF\u0900-\u097F\u0980-\u09FF\u0A00-\u0A7F\u0A80-\u0AFF\u0B00-\u0B7F\u0B80-\u0BFF\u0C00-\u0C7F\u0C80-\u0CFF\u0D00-\u0D7F\u0D80-\u0DFF\u0E00-\u0E7F\u0E80-\u0EFF\u0F00-\u0FFF\u1000-\u109F\u10A0-\u10FF\u1100-\u11FF\u1200-\u137F\u1380-\u139F\u13A0-\u13FF\u1400-\u167F\u1680-\u169F\u16A0-\u16FF\u1700-\u171F\u1720-\u173F\u1740-\u175F\u1760-\u177F\u1780-\u17FF\u1800-\u18AF\u1900-\u194F\u1950-\u197F\u1980-\u19DF\u19E0-\u19FF\u1A00-\u1A1F\u1B00-\u1B7F\u1D00-\u1D7F\u1D80-\u1DBF\u1DC0-\u1DFF\u1E00-\u1EFF\u1F00-\u1FFFu20D0-\u20FF\u2100-\u214F\u2C00-\u2C5F\u2C60-\u2C7F\u2C80-\u2CFF\u2D00-\u2D2F\u2D30-\u2D7F\u2D80-\u2DDF\u2F00-\u2FDF\u2FF0-\u2FFF\u3040-\u309F\u30A0-\u30FF\u3100-\u312F\u3130-\u318F\u3190-\u319F\u31C0-\u31EF\u31F0-\u31FF\u3200-\u32FF\u3300-\u33FF\u3400-\u4DBF\u4DC0-\u4DFF\u4E00-\u9FFF\uA000-\uA48F\uA490-\uA4CF\uA700-\uA71F\uA800-\uA82F\uA840-\uA87F\uAC00-\uD7AF\uF900-\uFAFF\.!#$%&'*+-/=?^_`{|}~\-\d]+)@(?!\.)([a-zA-Z0-9\u0080-\u00FF\u0100-\u017F\u0180-\u024F\u0250-\u02AF\u0300-\u036F\u0370-\u03FF\u0400-\u04FF\u0500-\u052F\u0530-\u058F\u0590-\u05FF\u0600-\u06FF\u0700-\u074F\u0750-\u077F\u0780-\u07BF\u07C0-\u07FF\u0900-\u097F\u0980-\u09FF\u0A00-\u0A7F\u0A80-\u0AFF\u0B00-\u0B7F\u0B80-\u0BFF\u0C00-\u0C7F\u0C80-\u0CFF\u0D00-\u0D7F\u0D80-\u0DFF\u0E00-\u0E7F\u0E80-\u0EFF\u0F00-\u0FFF\u1000-\u109F\u10A0-\u10FF\u1100-\u11FF\u1200-\u137F\u1380-\u139F\u13A0-\u13FF\u1400-\u167F\u1680-\u169F\u16A0-\u16FF\u1700-\u171F\u1720-\u173F\u1740-\u175F\u1760-\u177F\u1780-\u17FF\u1800-\u18AF\u1900-\u194F\u1950-\u197F\u1980-\u19DF\u19E0-\u19FF\u1A00-\u1A1F\u1B00-\u1B7F\u1D00-\u1D7F\u1D80-\u1DBF\u1DC0-\u1DFF\u1E00-\u1EFF\u1F00-\u1FFF\u20D0-\u20FF\u2100-\u214F\u2C00-\u2C5F\u2C60-\u2C7F\u2C80-\u2CFF\u2D00-\u2D2F\u2D30-\u2D7F\u2D80-\u2DDF\u2F00-\u2FDF\u2FF0-\u2FFF\u3040-\u309F\u30A0-\u30FF\u3100-\u312F\u3130-\u318F\u3190-\u319F\u31C0-\u31EF\u31F0-\u31FF\u3200-\u32FF\u3300-\u33FF\u3400-\u4DBF\u4DC0-\u4DFF\u4E00-\u9FFF\uA000-\uA48F\uA490-\uA4CF\uA700-\uA71F\uA800-\uA82F\uA840-\uA87F\uAC00-\uD7AF\uF900-\uFAFF\-\.\d]+)((\.([a-zA-Z\u0080-\u00FF\u0100-\u017F\u0180-\u024F\u0250-\u02AF\u0300-\u036F\u0370-\u03FF\u0400-\u04FF\u0500-\u052F\u0530-\u058F\u0590-\u05FF\u0600-\u06FF\u0700-\u074F\u0750-\u077F\u0780-\u07BF\u07C0-\u07FF\u0900-\u097F\u0980-\u09FF\u0A00-\u0A7F\u0A80-\u0AFF\u0B00-\u0B7F\u0B80-\u0BFF\u0C00-\u0C7F\u0C80-\u0CFF\u0D00-\u0D7F\u0D80-\u0DFF\u0E00-\u0E7F\u0E80-\u0EFF\u0F00-\u0FFF\u1000-\u109F\u10A0-\u10FF\u1100-\u11FF\u1200-\u137F\u1380-\u139F\u13A0-\u13FF\u1400-\u167F\u1680-\u169F\u16A0-\u16FF\u1700-\u171F\u1720-\u173F\u1740-\u175F\u1760-\u177F\u1780-\u17FF\u1800-\u18AF\u1900-\u194F\u1950-\u197F\u1980-\u19DF\u19E0-\u19FF\u1A00-\u1A1F\u1B00-\u1B7F\u1D00-\u1D7F\u1D80-\u1DBF\u1DC0-\u1DFF\u1E00-\u1EFF\u1F00-\u1FFF\u20D0-\u20FF\u2100-\u214F\u2C00-\u2C5F\u2C60-\u2C7F\u2C80-\u2CFF\u2D00-\u2D2F\u2D30-\u2D7F\u2D80-\u2DDF\u2F00-\u2FDF\u2FF0-\u2FFF\u3040-\u309F\u30A0-\u30FF\u3100-\u312F\u3130-\u318F\u3190-\u319F\u31C0-\u31EF\u31F0-\u31FF\u3200-\u32FF\u3300-\u33FF\u3400-\u4DBF\u4DC0-\u4DFF\u4E00-\u9FFF\uA000-\uA48F\uA490-\uA4CF\uA700-\uA71F\uA800-\uA82F\uA840-\uA87F\uAC00-\uD7AF\uF900-\uFAFF]){2,63})+)$/i
I'll avoid copy-pasting complete answers, so I'll just link this to a similar answer I provided here: How to validate a unicode email?
There is also a live demo available for the regex above at: http://jsfiddle.net/aossikine/qCLVH/3/
Upvotes: 3