Varshini Adiga

Reputation: 21

What is the right way to write a regular expression in C++?

I am having a hard time writing the regular expression below in C++:

(?=[a-zA-Z])*(?=[\s])?(00|\+)[\s]?[0-9]+[\s]?[0-9]+(?=[\sa-zA-Z])*

Example string: "ABC + 91 9997474545 DEF"

Matched string must be: "+ 91 9997474545"

C++ code :

#include <iostream> 
#include <regex> 

using namespace std; 
int main() 
{ 
    string a = "ABC + 91 9997474545 DEF"; 
    try
    {
        regex b("(?=[a-zA-Z])*(?=[\\s])?(00|\\+)[\\s]?[0-9]+[\\s]?[0-9]+(?=[\\sa-zA-Z])*"); 

        smatch amatch;
        if ( regex_search(a, amatch, b) )
        {
            for(const auto& aMa : amatch)
            {
                cout<< "match :" <<aMa.str()<<endl;
            }
        }
    }
    catch (const regex_error& err)
    { 
        std::cout << "There was a regex_error caught: " << err.what() << '\n'; 
    }
    return 0; 
}

Output:

There was a regex_error caught: regex_error

What is wrong in the code?

Upvotes: 2

Views: 107

Answers (1)

tdao

Reputation: 17668

Edit: an improved version (based on Toto's comment):

regex b(R"(([[:alpha:]]*\s*)(\+?\s*\d+\s*\d+)(\s*[[:alpha:]]*))");
  • Use the [[:alpha:]] character class, which matches alphabetic characters only, instead of \w, which also matches digits and underscores. (A plain [alpha] would only match the letters a, l, p and h.)
  • In the second (main) group (\+?\s*\d+\s*\d+), use \d+ rather than \d* to require at least one digit.
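
To show how the improved pattern can be used to pull out only the number, here is a minimal sketch. The helper name extract_number is hypothetical (it is not part of the question or of the standard library), and returning an empty string on failure is just one possible convention.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the number part of `text`
// (optional plus sign included), or an empty string if nothing matches.
std::string extract_number(const std::string& text)
{
    // Raw string literal, so each backslash is written only once.
    // [[:alpha:]] is the POSIX class for alphabetic characters.
    static const std::regex pattern(
        R"(([[:alpha:]]*\s*)(\+?\s*\d+\s*\d+)(\s*[[:alpha:]]*))");

    std::smatch m;
    if (!std::regex_search(text, m, pattern))
        return std::string();

    // m[0] is the whole match; m[2] is the second capture group,
    // i.e. the optional plus sign followed by the digits.
    return m[2].str();
}
```

For the example string "ABC + 91 9997474545 DEF", the second group holds "+ 91 9997474545", so that is what the helper returns.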

Two suggestions to make your code more readable:

  • Use a raw string literal (R"(...)") so that backslashes do not have to be doubled
  • Use shorthand character classes such as \w (word characters: letters, digits and underscore), \s (whitespace), \d (digits)

Then your regex could be simplified like this:

regex b(R"((\w*\s*)(\+?\s*\d*\s*\d*)(\s*\w*))");

which yields the following output (assuming you want to extract the number with an optional plus sign):

match :ABC + 91 9997474545 DEF
match :ABC 
match :+ 91 9997474545
match : DEF

Note that the regex above contains 3 groups (the first output line is the whole match, followed by one line per group):

  • (\w*\s*) - some preceding word characters and spaces
  • (\+?\s*\d*\s*\d*) - an optional plus sign, then some digits (91), some optional spaces, and some more digits (9997474545)
  • (\s*\w*) - some spaces, then some word characters.
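
As for the regex_error in the question: as far as I can tell, it comes from the quantifiers placed directly after the lookaheads, as in (?=[a-zA-Z])* and (?=[\s])?. The ECMAScript grammar that std::regex uses by default does not allow a quantifier on an assertion, so the constructor throws. Since regex_search already scans the string for a match, the surrounding lookaheads are probably not needed at all. The sketch below is written under that assumption; the helper names compiles and find_number are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: reports whether `pattern` is accepted by std::regex
// (default ECMAScript grammar).
bool compiles(const std::string& pattern)
{
    try {
        std::regex re(pattern);
        return true;
    } catch (const std::regex_error&) {
        return false;
    }
}

// Hypothetical helper: searches `text` for a number that starts with
// "00" or "+", as in the question, without lookaheads or outer groups.
// Returns the whole match, or an empty string if there is none.
std::string find_number(const std::string& text)
{
    static const std::regex pattern(R"((00|\+)\s?[0-9]+\s?[0-9]+)");

    std::smatch m;
    return std::regex_search(text, m, pattern) ? m.str() : std::string();
}
```

With this pattern the whole match is already the number, so m.str() can be used directly instead of picking a capture group. Unlike the simplified patterns above, it keeps the "00" alternative from the question and makes the prefix mandatory.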

Upvotes: 1
