C++ Shell Online Execution Link: http://cpp.sh/5z2uq
I am writing a regex to validate an email ID. The local name may contain multiple dots and plus characters, and the domain name must contain exactly one dot.
The problem I'm facing is with the capture groups. The domain name capture, i.e. group #2, works as expected, as seen in the output. The local name capture, i.e. group #1, does not.
Group #1 should capture the local name only up to, but not including, the first '+' sign. Instead, it captures past the '+', and the captured local name is missing its last character.
Please take a look at my C++ regex code:
#include <iostream>
#include <regex>
#include <string>
#include <vector>
using namespace std;

int main()
{
    string str;
    vector<string> emails = {
        "[email protected]",
        "[email protected]",
        "[email protected]",
        "[email protected]",
        "[email protected]"
    };
    for (auto ele : emails)
    {
        str = ele;
        // Group 1: local name, group 2: domain name.
        regex e("([\\w+\\.]+)\\+*[\\+\\w]+\\@([\\w]+\\.[\\w]+)$");
        smatch parts;
        bool match = regex_match(str, parts, e);
        if (match == true)
        {
            cout << "Local : " << parts.str(1) << endl;
            cout << "Domain : " << parts.str(2) << endl;
            cout << "Valid Email ID: " << ele << endl << endl;
        }
        else
        {
            cout << "Invalid Email ID: " << ele << endl << endl;
        }
    }
    return 0;
}
Output:
Local : loca
Domain : domain.com
Valid Email ID: [email protected]

Local : local.constan
Domain : domain.com
Valid Email ID: [email protected]

Local : local+addo
Domain : domain.com
Valid Email ID: [email protected]

Local : local.constant+addo
Domain : domain.com
Valid Email ID: [email protected]

Invalid Email ID: [email protected]
Notice how the captured local name (group #1) is missing its last character in every valid case, and how it still includes the '+' part.
You can use this expression:
"([\\w.]+)(?:\\+[\\w]+)*\\@([\\w]+\\.[\\w]+)$"
The first part, ([\\w.]+), captures the local part (one or more word characters or dots). Because '+' is not in this character class, the capture stops before the first '+'.
The second part, (?:\\+[\\w]+)*, is a non-capturing group repeated zero or more times; each repetition matches a plus sign followed by one or more word characters.
The third part, \\@, matches the @ character.
The last part, ([\\w]+\\.[\\w]+), captures the domain part (two words separated by one dot), which you already got right.
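To see the difference between the two expressions side by side, here is a minimal sketch (assuming C++17 for std::optional). The helper split_email and the names question_pattern and answer_pattern are hypothetical, introduced only for this illustration. In the original expression, '+' is part of the first character class, so group #1 greedily consumes everything before '@' and then has to give back one character so that [\\+\\w]+ can match; that is where the last character goes.

```cpp
#include <optional>
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper (not from the original post): returns {local, domain}
// when the whole address matches `pattern`, or std::nullopt otherwise.
std::optional<std::pair<std::string, std::string>>
split_email(const std::string& email, const std::regex& pattern)
{
    std::smatch parts;
    if (!std::regex_match(email, parts, pattern)) {
        return std::nullopt;
    }
    return std::make_pair(parts.str(1), parts.str(2));
}

// Expression from the question: '+' is inside the first character class, so
// group 1 swallows the "+addon" part and then backtracks by one character
// to leave something for [\+\w]+ to match.
const std::regex question_pattern(
    R"re(([\w+\.]+)\+*[\+\w]+\@([\w]+\.[\w]+)$)re");

// Expression from the answer: group 1 cannot contain '+', and the optional
// "+tag" parts are consumed by a non-capturing group instead.
const std::regex answer_pattern(
    R"re(([\w.]+)(?:\+[\w]+)*\@([\w]+\.[\w]+)$)re");
```

For example, with the assumed input "local.constant+addon@domain.com" (inferred from the output in the question, since the original addresses are not visible), split_email(..., question_pattern) yields the local name "local.constant+addo", while split_email(..., answer_pattern) yields "local.constant". Both give "domain.com" as the domain. An address with two dots in the domain, such as the made-up "local@sub.domain.com", is rejected by the answer's expression.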