Meekaa Saangoo

Reputation: 349

How to make HTML5 email validation regex work in C++?

I am trying to validate email addresses on both the client side and the server side. The client side is JavaScript (web front end). The server side is written in C++11.

The regex I am using to validate email is provided by the HTML standard, at https://html.spec.whatwg.org/multipage/input.html#e-mail-state-(type=email) and I am reproducing it here for quick reference:

/^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/

The validation works on the client side in JavaScript, but the server-side validation using std::regex_match fails.

The following C++ code checks whether an email is valid:

bool is_valid_email(std::string email)
{
    // Regex from HTML5 spec.
    static std::regex const email_regex {R"(/^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/)"};

    return std::regex_match(email, email_regex);
}

What am I doing wrong?

Upvotes: 1

Views: 183

Answers (2)

JJCUBER

Reputation: 21

The regex you are running is expecting a / before the start (^) and after the end ($) of the string. You need to remove the /^ and $/ at the beginning and end:

R"([a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*)"

Upvotes: -1

Wiktor Stribiżew

Reputation: 627087

The / characters at both ends of a JavaScript regex literal are regex delimiters; they are not part of the regular expression pattern.

In C++, you define the regex using either regular or raw string literals, so you do not need to include regex delimiters in the pattern.

So, if you have const regex = /abc/ in JavaScript, you may use

std::regex const regex {R"(abc)"};
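To see why the delimiters matter, here is a minimal sketch. The names matches_fully and check_delimiters are hypothetical helpers introduced only for illustration; they are not part of the question's code. It shows that slashes left inside a C++ pattern are treated as literal characters that the input must contain.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if the whole of `text` matches `pattern`.
bool matches_fully(const std::string& pattern, const std::string& text)
{
    const std::regex re(pattern);
    return std::regex_match(text, re);
}

// Illustrative checks; call this from main() to run them.
void check_delimiters()
{
    // Pattern without delimiters: the C++ counterpart of /abc/ in JavaScript.
    assert(matches_fully(R"(abc)", "abc"));

    // Delimiters kept by mistake: the slashes become literal characters,
    // so the plain input no longer matches, but the slash-wrapped one does.
    assert(!matches_fully(R"(/abc/)", "abc"));
    assert(matches_fully(R"(/abc/)", "/abc/"));
}
```

Calling check_delimiters() from main() should complete without tripping an assertion.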

In your case, you do not even need the ^ at the start and $ at the end of the pattern since regex_match requires a full string match:

bool is_valid_email(std::string email)
{
    // Regex from HTML5 spec.
    static std::regex const email_regex {R"([a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*)"};
    return std::regex_match(email, email_regex);
}
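As a rough illustration of why the anchors are redundant with regex_match, the sketch below contrasts it with regex_search, which accepts a match anywhere in the input. The helper names email_pattern, contains_email and check_email_validation, as well as the sample addresses, are assumptions made for this example only.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Same pattern as in the function above: no delimiters, no anchors.
const std::regex& email_pattern()
{
    static const std::regex re(
        R"([a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*)");
    return re;
}

// regex_match succeeds only if the entire string matches the pattern.
bool is_valid_email(const std::string& email)
{
    return std::regex_match(email, email_pattern());
}

// Hypothetical helper: regex_search succeeds if any substring matches.
bool contains_email(const std::string& text)
{
    return std::regex_search(text, email_pattern());
}

// Illustrative checks; call this from main() to run them.
void check_email_validation()
{
    assert(is_valid_email("user@example.com"));
    assert(!is_valid_email("no-at-sign"));

    // Surrounding text is rejected by regex_match even without ^ and $ ...
    assert(!is_valid_email("contact: user@example.com today"));
    // ... but accepted by regex_search, which would need the anchors.
    assert(contains_email("contact: user@example.com today"));
}
```

If the validation were ever switched to regex_search, the ^ and $ anchors would have to be put back to keep the full-string requirement.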

Also, / is not a special regex metacharacter, so you do not need to escape it.

NOTE: Since the latest JavaScript ECMAScript implementations support many more regex features, such as infinite-width lookbehind and named capturing groups, it is not always straightforward to convert a JavaScript regex pattern to a C++-compatible regex pattern.
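One way to find out early whether a JavaScript pattern uses a feature that std::regex lacks is to try constructing it and catch std::regex_error. The sketch below assumes a hypothetical helper named compiles and a few sample patterns chosen for illustration; exact error codes vary between standard library implementations, so only success or failure is checked.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if std::regex accepts the pattern
// (default ECMAScript grammar), false if construction throws.
bool compiles(const std::string& pattern)
{
    try {
        const std::regex re(pattern);
        (void)re;  // only construction matters here
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}

// Illustrative checks; call this from main() to run them.
void check_feature_support()
{
    // Lookahead is part of the grammar std::regex implements.
    assert(compiles(R"(a(?=b))"));

    // Lookbehind and named groups are later JavaScript additions that the
    // std::regex grammar does not include, so construction is expected to throw.
    assert(!compiles(R"((?<=a)b)"));
    assert(!compiles(R"((?<year>\d{4}))"));
}
```

A check like this could run in a unit test whenever a pattern is copied from the front end to the server.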

Upvotes: 2
