123tv

Reputation: 143

C++ std::regex behaves differently with -std=c++11 and -std=gnu++11

There seems to be a difference in std::regex behavior depending on whether the code is compiled with GNU extensions enabled.

The following code throws an exception when compiled with -std=c++11, but works with -std=gnu++11:

#include <regex>
#include <string>
#include <iostream>

int main(int argc, char **argv) {

    std::string rex { "\\[1\\]" };
    std::string str { "[1]" };
    std::regex regex(rex, std::regex::extended);
    auto match = std::regex_match(str.begin(), str.end(), regex);
    std::cout << "Result is " << match << std::endl;
    return 0;
}

I tried gcc versions from 4.9.4 up to 9.2 and got the same behavior with all of them. Any ideas why this code behaves differently?

Upvotes: 1

Views: 849

Answers (1)

interjay

Reputation: 110108

std::regex::extended selects the POSIX extended regular expression (ERE) grammar. Under those syntax rules, a backslash may only precede a "special character", which is one of .[\()*+?{|^$. The left bracket [ is a special character, but the right bracket ] is not. So to be standard-compliant, your regular expression should be "\\[1]" instead of "\\[1\\]".

Looking at the libstdc++ source code, there is the following in regex_scanner.tcc:

#ifdef __STRICT_ANSI__
      // POSIX says it is undefined to escape ordinary characters
      __throw_regex_error(regex_constants::error_escape,
                  "Unexpected escape character.");
#else
      _M_token = _S_token_ord_char;
      _M_value.assign(1, __c);
#endif

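GCC defines __STRICT_ANSI__ when compiling with a strict mode such as -std=c++11, but not with -std=gnu++11. So this shows that allowing a backslash before a non-special character is a GNU extension. I don't know where this extension is documented.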

Upvotes: 3
