Jack Thor

Reputation: 1602

What does '[' mean in a regex in this context? (2 problems)

My final Regex (Updated):

^[A-z0-9]*([\.]?\w*)*[A-z0-9]+@[A-Za-z0-9]([\.]?\w+)+\.[A-Za-z][A-Za-z][A-Za-z]$

1st Question

I know that square brackets [] are used for a character set/class, but what I don't understand is why it does not need a closing bracket here, and why this makes my regex act weirdly. For example, here is the original:

[A-z0-9]*[\.?[\w+]+[A-z0-9]+@[A-Za-z0-9][\.?[A-Za-z0-9+]+\.[A-Za-z][A-Za-z][A-Za-z]
Matches ----> [email protected]

Notice the second [ (before the first \.) and the sixth [ (before the second \.). Those two do not have closing brackets. I also know that ? means zero or one of the preceding element. When I remove the bracket, as in 9]*\.?[\w, the regex does not recognize the reg. part of the string. I also tried adding a closing bracket, [\.?[\w+]] and [\.?][\w+], but that causes the same behavior; note that the first one, [\.?[\w+]], causes it not to recognize the string at all. Can anyone help explain this behavior?

2nd Question

Based on the above regex, I am using JavaScript to test my string, and I don't know why the test passes even though only part of the string matches. Some of my test cases:

reg@[email protected]
reg.exp.email@[email protected]
reg.exp.email@[email protected]
[email protected]

I should clarify that only [email protected] should pass.

If you go to this site and paste my regex and test string, you can see what I mean. In my JavaScript I have this:

var inputVal = $(this).val();
var re = /[A-z0-9]*[\.?[\w+]+[A-z0-9]+@[A-Za-z0-9][\.?[A-Za-z0-9+]+\.[A-Za-z][A-Za-z][A-Za-z]/;
if (!re.test(inputVal)) {
    $("#emailValidate").css({"display": "inline-block", "color": "red", "margin-top": "4px"});
} else {
    $("#emailValidate").css("display", "none");
}

With this code, all of the test cases above pass. Any idea about this one?

Upvotes: 0

Views: 83

Answers (2)

peterfoldi

Reputation: 7471

I know this was already answered, but ^$ alone doesn't answer all of MY questions. The two extra '[' characters are still useless. For example, the first part of the regex (before the @) allows at most one dot in the email address, and the only thing that makes it match multiple dots is the extra '[', which shouldn't have that effect. I think I have now found what it was supposed to be in the first place:

^[A-z0-9]*(\.?\w+)+@...

In my opinion, the + after the \w also had no effect in the original regex, where it was inside [] and not inside () as above. Or, if Jack wants to match exactly 2 dots, then:

^[A-z0-9]*(\.\w+){2}@...

I just don't believe that the extra '[' and the '+' inside [] make it a reliable pattern. Please correct me, sniffer, if I am wrong.
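To make the difference between the two suggestions concrete, here is a small sketch that tests only the local part, cut off at the @ (the rest of each pattern is elided above and is not guessed here). The sample inputs are hypothetical and were not taken from the question.

```javascript
// Local-part patterns from the answer above, cut off at the "@".
// The remainder of each pattern is elided in the answer and is left out here.
const anyDots = /^[A-z0-9]*(\.?\w+)+@/;
const exactlyTwoDots = /^[A-z0-9]*(\.\w+){2}@/;

// Hypothetical sample inputs (assumption: not from the question).
const samples = ["reg@", "reg.exp@", "reg.exp.email@", "reg..exp@"];

// Run both patterns against every sample and collect the results.
const results = samples.map((s) => ({
  input: s,
  anyDots: anyDots.test(s),
  exactlyTwoDots: exactlyTwoDots.test(s),
}));

console.log(results);
```

In this sketch both patterns reject consecutive dots such as reg..exp@, because each dot has to be followed by at least one \w character.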

Upvotes: 0

Ibrahim Najjar

Reputation: 19423

Answer to first question

Including an opening bracket [ inside a character class is fine: it is treated as a literal character and doesn't create another character class inside the first one. This is why it can be included in the character class without escaping.
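A quick way to check this in JavaScript (without the u or v flags) is to test the class from the question on its own. This is only a sketch; the test strings are made up for illustration.

```javascript
// [\.?[\w+] from the question is ONE character class.
// Inside it, '.', '?', '[' and '+' are literal characters; \w keeps its meaning.
const cls = /^[\.?[\w+]+$/;

const literalsMatch = cls.test(".?[+abc_1"); // every character is in the class
const closingBracketFails = cls.test("]");   // ']' is not a member of the class
const hyphenFails = cls.test("a-b");         // '-' is not a member either

// Adding a second ']' does not close a nested class: the class ends at the
// first ']', and the second ']' becomes a literal that must appear in the input.
const extraBracket = /^[\.?[\w+]]$/;
const needsLiteralBracket = extraBracket.test("a]");
const plainLetterFails = extraBracket.test("a");

console.log(literalsMatch, closingBracketFails, hyphenFails); // true false false
console.log(needsLiteralBracket, plainLetterFails);           // true false
```

The second pattern suggests why [\.?[\w+]] stopped matching the address in the question: it now requires a literal ] in the input.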

Answer to second question

You need to enclose your pattern between the start-of-line ^ and end-of-line $ anchors so that it has to match the entire input string rather than just part of it, like this:

^[A-z0-9]*[\.?[\w+]+[A-z0-9]+@[A-Za-z0-9][\.?[A-Za-z0-9+]+\.[A-Za-z][A-Za-z][A-Za-z]$
|                                                                                   |
Start-Of-Line                                                             End_Of_Line

Regex101 Demo
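Here is a minimal sketch of the difference the anchors make with RegExp.prototype.test. The pattern is the one from the question; the two addresses are hypothetical stand-ins, since the original test strings are not reproduced here.

```javascript
// Pattern from the question, first without anchors, then wrapped in ^ and $.
const unanchored = /[A-z0-9]*[\.?[\w+]+[A-z0-9]+@[A-Za-z0-9][\.?[A-Za-z0-9+]+\.[A-Za-z][A-Za-z][A-Za-z]/;
const anchored = new RegExp("^" + unanchored.source + "$");

// Hypothetical inputs (assumption: stand-ins for the question's test cases).
const good = "reg.exp@example.com";
const bad = "reg@exp@example.com"; // two '@' signs, should be rejected

// test() returns true as soon as ANY substring matches, so the unanchored
// pattern accepts the bad input by matching only "exp@example.com".
const unanchoredAcceptsBad = unanchored.test(bad);
const anchoredAcceptsBad = anchored.test(bad);
const anchoredAcceptsGood = anchored.test(good);

console.log(unanchoredAcceptsBad, anchoredAcceptsBad, anchoredAcceptsGood); // true false true
```

With the anchors in place the match has to start at the first character and end at the last, so the stray reg@ prefix can no longer be skipped.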

Upvotes: 2
