Seb

Reputation: 11

RegEx Email validation issue

Before I jump into my question, let me preface it with this: I had a strict set of requirements to follow with regard to email address validation. I attempted to dispute some of them, but was overruled.

Anyways, amongst the requirements were the following:

My attempt to satisfy the requirements was successful, with one snag: the regex I am using for the local part unintentionally requires a minimum of 3 characters. Here is my attempt:

(^(?!.*\\.{2})([a-zA-Z0-9{1}]+[a-zA-Z0-9\\._\\-\\+!#$%&*/=?`{|}~']+[a-zA-Z0-9{1}])+@([a-zA-Z0-9{1}]+[a-zA-Z0-9\\-]+[a-zA-Z0-9{1}]+\\.)+([a-zA-Z0-9\\-]{2}|net|com|gov|mil|org|edu|int|NET|COM|GOV|MIL|ORG|EDU|INT)$)|^$

I understand why this is happening; I just don't know how to get around it. Any assistance would be greatly appreciated.

Edited: After much discussion, it turns out that my issues were not specific to the local part of the email address. The domain part is also suffering from the same issues.

Thanks, Seb

Upvotes: 0

Views: 738

Answers (2)

nhahtdh

Reputation: 56829

For the local part (the part before @), this is the regex fragment that satisfies all conditions above:

^[a-zA-Z0-9][a-zA-Z0-9+!#$%&*/=?`{|}~'_-]*(?:\.[a-zA-Z0-9+!#$%&*/=?`{|}~'_-]+)*

Breakdown:

^                                 # Beginning of the string
[a-zA-Z0-9]                       # First character is not special
[a-zA-Z0-9+!#$%&*/=?`{|}~'_-]*    # 0 or more alphanumeric and special characters, except .
(?:                               # Group, repeated 0 or more times
  \.                              # A literal .
  [a-zA-Z0-9+!#$%&*/=?`{|}~'_-]+  # 1 or more alphanumeric and special characters, except .
)*

The "No consecutive periods" and "No periods directly before or after the @" conditions are enforced by the fact that . can only appear between 2 non-dot characters, as seen in the regex above.
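As a quick sanity check, here is a minimal sketch that runs the local-part fragment against a few inputs. The question does not say which language is in use, so Python is an assumption here, and the sample strings and the helper name `is_valid_local_part` are made up for illustration.

```python
import re

# Local-part fragment from above. fullmatch() requires the whole string to
# match, so no trailing anchor is needed when testing the fragment alone.
LOCAL_PART = re.compile(
    r"^[a-zA-Z0-9][a-zA-Z0-9+!#$%&*/=?`{|}~'_-]*"
    r"(?:\.[a-zA-Z0-9+!#$%&*/=?`{|}~'_-]+)*"
)


def is_valid_local_part(text: str) -> bool:
    """Return True if the whole string matches the local-part pattern."""
    return LOCAL_PART.fullmatch(text) is not None


# Hypothetical sample inputs, not taken from the question.
for sample in ["a", "ab", "john.doe", "a..b", ".ab", "ab.", "_ab"]:
    print(sample, is_valid_local_part(sample))
```

Single-character local parts such as "a" are accepted, while "a..b", ".ab" and "ab." are rejected because every dot must be followed by at least one non-dot character.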

I don't have full knowledge of the email specification, so even if this satisfies the conditions in the question, I can't guarantee that a matching address is valid according to the specs.


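The domain part has the same problem with {1} inside the character class. Inside [...], {1} is not a quantifier; it simply adds the literal characters {, 1 and } to the class.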
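To illustrate the point about {1}, here is a small sketch, again assuming Python and using made-up variable names. It takes the domain-label portion of the question's regex (with the doubled backslashes reduced to single ones) and shows both the stray literal brace and the unintended minimum length.

```python
import re

# Inside [...], "{1}" is not a quantifier: it adds "{", "1" and "}" to the set.
first_char = re.compile(r"[a-zA-Z0-9{1}]")
print(bool(first_char.fullmatch("{")))  # True: the brace is accepted

# Three consecutive "+"-quantified pieces each need at least one character,
# which is where the unintended 3-character minimum comes from.
label = re.compile(r"[a-zA-Z0-9{1}]+[a-zA-Z0-9\-]+[a-zA-Z0-9{1}]+")
print(bool(label.fullmatch("ab")))   # False: only 2 characters
print(bool(label.fullmatch("abc")))  # True: 3 characters
```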

I took the liberty of applying the hostname restriction, under which labels must not start or end with -.

[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*(?:\.[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*)*

If you want to enforce TLD:

[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*(?:\.[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*)*\.(?i:[a-z0-9]{2}|net|com|gov|mil|org|edu|int)

Note that I make the TLD case-insensitive by using a non-capturing group with the i flag.
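For completeness, here is one way the fragments might be joined into a single pattern. This is only a sketch: the way the pieces are combined, the Python syntax and the name `is_valid_email` are my assumptions, and the empty-string case is kept only because the question's regex ends in |^$.

```python
import re

# Fragments from this answer, joined with a literal "@".
LOCAL = (
    r"[a-zA-Z0-9][a-zA-Z0-9+!#$%&*/=?`{|}~'_-]*"
    r"(?:\.[a-zA-Z0-9+!#$%&*/=?`{|}~'_-]+)*"
)
LABEL = r"[a-zA-Z0-9]+(?:-[a-zA-Z0-9]+)*"
TLD = r"(?i:[a-z0-9]{2}|net|com|gov|mil|org|edu|int)"
DOMAIN = LABEL + r"(?:\." + LABEL + r")*\." + TLD

EMAIL = re.compile(LOCAL + "@" + DOMAIN)


def is_valid_email(address: str) -> bool:
    """Accept an empty string (as the question's regex does) or a full match."""
    return address == "" or EMAIL.fullmatch(address) is not None


# Hypothetical sample inputs, not taken from the question.
for sample in ["a@b.com", "a@b.COM", "a@b.c.uk", "a..b@x.org", "a@-b.com", "a@b.info"]:
    print(sample, is_valid_email(sample))
```

With this combination, "a@b.com" and "a@b.COM" are accepted, while "a@-b.com" fails the label rule and "a@b.info" fails because the TLD is neither two characters long nor in the list.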

Upvotes: 2

oddparity

Reputation: 436

Could you please try this (just slight modifications to your code):

(^(?!.*\\.{2})([a-zA-Z0-9][a-zA-Z0-9\\._\\-\\+!#$%&*/=?`{|}~']+[a-zA-Z0-9])+@([a-zA-Z0-9]+[a-zA-Z0-9\\-]+[a-zA-Z0-9]\\.)+([a-zA-Z0-9\\-]{2}|net|com|gov|mil|org|edu|int|NET|COM|GOV|MIL|ORG|EDU|INT)$)|^$

(It handles the test addresses provided so far: none of them match.)

Upvotes: 0
