mlgpro

Reputation: 171

C++ regex_match not working

Here is part of my code:

bool CSettings::bParseLine ( const char* input )
{
    //_asm INT 3


    std::string line ( input );
    std::size_t position = std::string::npos, comment;

    regex cvarPattern ( "\\.([a-zA-Z_]+)" );
    regex parentPattern ( "^([a-zA-Z0-9_]+)\\." );
    regex cvarValue ( "\\.[a-zA-Z0-9_]+[ ]*=[ ]*(\\d+\\.*\\d*)" );
    std::cmatch matchedParent, matchedCvar;


    if ( line.empty ( ) )
        return false;

    if ( !std::regex_match ( line.c_str ( ), matchedParent, parentPattern ) )
        return false;

    if ( !std::regex_match ( line.c_str ( ), matchedCvar, cvarPattern ) )
        return false;
...
}

I use it to split lines that I read from a file. The lines look like this:

foo.bar = 15
baz.asd = 13
ddd.dgh = 66

I want to extract the parts of each line. For example, for the first line, foo.bar = 15, I want to end up with something like:

a = foo
b = bar
c = 15

But regex_match always returns false. I tested the patterns on many online regex checkers, and even in Visual Studio, and they work fine there. Do I need a different syntax for C++ regex_match? I'm using Visual Studio 2013 Community.

Upvotes: 7

Views: 12056

Answers (2)

Galik

Reputation: 48645

The problem is that std::regex_match must match the entire string, but each of your patterns matches only part of it.

You need to either use std::regex_search or alter your regular expression to match all three parts at once:

#include <initializer_list> // needed for `const auto test = { ... }`
#include <regex>
#include <string>
#include <iostream>

const auto test =
{
      "foo.bar = 15"
    , "baz.asd = 13"
    , "ddd.dgh = 66"
};

int main()
{
    const std::regex r(R"~(([^.]+)\.([^\s]+)[^0-9]+(\d+))~");
    //                     (  1  )  (   2  )       ( 3 ) <- capture groups

    std::cmatch m;

    for(const auto& line: test)
    {
        if(std::regex_match(line, m, r))
        {
            // m.str(0) is the entire matched string
            // m.str(1) is the 1st capture group
            // etc...
            std::cout << "a = " << m.str(1) << '\n';
            std::cout << "b = " << m.str(2) << '\n';
            std::cout << "c = " << m.str(3) << '\n';
            std::cout << '\n';
        }
    }
}

Regular expression: https://regex101.com/r/kB2cX3/2

Output:

a = foo
b = bar
c = 15

a = baz
b = asd
c = 13

a = ddd
b = dgh
c = 66

Upvotes: 16

πάντα ῥεῖ

Reputation: 1

To keep the focus on the regex patterns, I'd prefer to use raw string literals in C++:

regex cvarPattern ( R"rgx(\.([a-zA-Z_]+))rgx" );
regex parentPattern ( R"rgx(^([a-zA-Z0-9_]+)\.)rgx" );
regex cvarValue ( R"rgx(\.[a-zA-Z0-9_]+[ ]*=[ ]*(\d+\.*\d*))rgx" );

Nothing between the rgx( and )rgx delimiters needs any extra escaping for C++ string literals.


Actually, what you have written in your question resembles the regular expressions I've written above as raw string literals.
You probably simply meant something like:

regex cvarPattern ( R"rgx(.([a-zA-Z_]+))rgx" );
regex parentPattern ( R"rgx(^([a-zA-Z0-9_]+).)rgx" );
regex cvarValue ( R"rgx(.[a-zA-Z0-9_]+[ ]*=[ ]*(\d+(\.\d*)?))rgx" );

I didn't dig deeper, but I don't see why your regular expression patterns need all of these escaped characters. Keep in mind that an unescaped . matches any character, not only a literal dot.


As for your question in the comment, you can use alternative sub-pattern groups and check which of them matched in the match results:

regex cvarValue ( 
   R"rgx(.[a-zA-Z0-9_]+[ ]*=[ ]*((\d+)|(\d+\.\d?)|([a-zA-Z]+)){1})rgx" );
                             // ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You probably don't need the separate cvarPattern and parentPattern regular expressions to inspect other (more detailed) parts of the match.

Upvotes: 4
