Anh

Reputation: 25

Regex for email addresses does not accept '&' before a dot

This is part of the function phpBB uses to validate email addresses. I have a valid email address, "pwd-p&r2.coop@...", but the validation rejects it.

Interestingly, when I tried "pwd-pr2.co&op@..." (with '&' after the dot) and "pwd-pr&2coop@..." (without a dot), both were accepted, but "pwd-pr&2.coop@..." (with a dot after '&') was not.

I have been trying to change the regex for several days but still have not figured out how to make it accept my email address.

function get_preg_expression($mode)
{
    switch ($mode)
    {
        case 'email':
        // Regex written by James Watts and Francisco Jose Martin Moreno
        // http://fightingforalostcause.net/misc/2006/compare-email-regex.php
        return '([\w\!\#$\%\&\'\*\+\-\/\=\?\^\`{\|\}\~]+\.)*(?:[\w\!\#$\%\&\'\*\+\-\/\=\?\^\`{\|\}\~]|&)+@((((([a-z0-9]{1}[a-z0-9\-]{0,62}[a-z0-9]{1})|[a-z])\.)+[a-z]{2,63})|(\d{1,3}\.){3}\d{1,3}(\:\d{1,5})?)';
    break;
    }
}

Upvotes: 2

Views: 1502

Answers (2)

Spudley

Reputation: 168715

Emails should not be validated using a regex, for exactly the reason this question illustrates: it's complex, and virtually nobody gets it right.

PHP has a built-in email validator in the form of the filter_var() function. This is generally the best option for validating emails.

It's a one-line piece of code, without any complex regular expressions required at all.

Example copied from the PHP manual:

<?php
$email_a = '[email protected]';
$email_b = 'bogus';

if (filter_var($email_a, FILTER_VALIDATE_EMAIL)) {
    echo "This (email_a) email address is considered valid.";
}
if (filter_var($email_b, FILTER_VALIDATE_EMAIL)) {
    echo "This (email_b) email address is considered valid.";
}
?>
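
If the check needs to slot in where the regex was used, it could be wrapped in a small function. This is only a sketch: the helper name is_valid_email is hypothetical, and the example.com address is a placeholder rather than anything from the question.

```php
<?php
// Hypothetical helper name; not part of phpBB or PHP itself.
function is_valid_email($email)
{
    // filter_var() returns the filtered string on success and false on failure,
    // so compare strictly against false to get a real boolean.
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(is_valid_email('user@example.com')); // bool(true)
var_dump(is_valid_email('bogus'));            // bool(false)
```

The strict comparison avoids relying on the truthiness of the returned string.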

Hope that helps.

Upvotes: 3

sigpwned

Reputation: 7443

Try this:

function get_preg_expression($mode)
{
    switch ($mode)
    {
        case 'email':
        // Regex written by James Watts and Francisco Jose Martin Moreno
        // http://fightingforalostcause.net/misc/2006/compare-email-regex.php
        // return '^(?:[\w!#$%&\'*+-\/=?^`{|}~]+\.)*(?:[\w!#$%&\'*+-\/=?^`{|}~]|&amp;)+$';
        return '(?:[\w!#$%&\'*+-\/=?^`{|}~]+\.)*(?:[\w!#$%&\'*+-\/=?^`{|}~]|&amp;)+@((((([a-z0-9]{1}[a-z0-9\-]{0,62}[a-z0-9]{1})|[a-z])\.)+[a-z]{2,63})|(\d{1,3}\.){3}\d{1,3}(\:\d{1,5})?)';
    break;
    }
}

I unescaped the characters that didn't need escaping in the leading bracket expressions. That seems to have done the trick. More specifically, the only characters I left escaped were: (a) ', due to PHP string escape syntax, and (b) / because I used / as PHP's preg delimiter.

I don't have a theory about why this unescaping helped, unfortunately. However, escaping only what needs escaping is good practice, and that is how I arrived at this solution, so I'm comfortable sharing it. The answer isn't satisfying, but at least it seems to work.
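
One thing worth checking, though, offered as a possibility rather than a confirmed diagnosis: inside a bracket expression an unescaped '-' between two characters forms a range, so the fragment +-\/ would cover every character from '+' to '/', which includes ',' and '.'. The sketch below isolates just that fragment (the two test patterns are illustrative, not the full email regex) and compares it with the escaped form.

```php
<?php
// Fragment as written above: the unescaped '-' sits between '+' and '\/',
// so PCRE reads it as the range 0x2B-0x2F, i.e. + , - . /
$unescaped = '/^[+-\/]$/';
// Same fragment with the hyphen escaped, as in the question's regex.
$escaped = '/^[+\-\/]$/';

foreach (array('+', ',', '-', '.', '/') as $char) {
    printf(
        "'%s' unescaped: %d, escaped: %d\n",
        $char,
        preg_match($unescaped, $char),
        preg_match($escaped, $char)
    );
}
```

If that is what is happening, the local part now accepts dots and commas anywhere, which may be more permissive than intended.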

Here's the test harness I used, in case it's useful, along with the match result for each input:

function get_preg_expression($mode)
{
    switch ($mode)
    {
        case 'email':
        // Regex written by James Watts and Francisco Jose Martin Moreno
        // http://fightingforalostcause.net/misc/2006/compare-email-regex.php
        // return '^(?:[\w!#$%&\'*+-\/=?^`{|}~]+\.)*(?:[\w!#$%&\'*+-\/=?^`{|}~]|&amp;)+$';
        return '(?:[\w!#$%&\'*+-\/=?^`{|}~]+\.)*(?:[\w!#$%&\'*+-\/=?^`{|}~]|&amp;)+@((((([a-z0-9]{1}[a-z0-9\-]{0,62}[a-z0-9]{1})|[a-z])\.)+[a-z]{2,63})|(\d{1,3}\.){3}\d{1,3}(\:\d{1,5})?)';
    break;
    }
}

// Build the delimited pattern once and run every test input through it.
$pattern = '/' . get_preg_expression('email') . '/';
$inputs = array(
    'pwd-pr&[email protected]', // Matches
    'pwd-pr&[email protected]', // Matches
    'pwd-p&[email protected]', // Matches
    'hello', // Does not match
    'hello@world', // Does not match
    '[email protected]', // Matches
);
foreach ($inputs as $input) {
    var_dump(preg_match($pattern, $input));
}

Upvotes: 1
