Bámer Balázs

Reputation: 169

C++ regular expression fails, while an online checker works

I have this code:

#include <regex>
#include <string>

class Clazz {
private:
  // static is required: non-static data members cannot be constexpr
  // (since C++17, static constexpr data members are implicitly inline)
  static constexpr char _csVersionPattern[] = "^[^\\(\\[\\)\\],]+$";
  //static constexpr char _csVersionPattern[] = "(^([\\(\\[])[!-'\\*+\\.-Z\\\\^-z\\|~-]*,[!-'\\*+\\.-Z\\\\^-z\\|~-]*([\\)\\]])$)|(^[^\\(\\[\\)\\],]+$)";
  static constexpr char _csIdPattern[] = "^[!-~]+$";
  bool _valid = false;
public:
  void func(std::string const& aId, std::string const& aVersion) {
    // std::regex::extended selects the POSIX ERE grammar
    std::regex idRegex{ _csIdPattern, std::regex::extended };
    std::regex versionRegex{ _csVersionPattern, std::regex::extended };
    auto validId = std::regex_match(aId, idRegex);
    auto validVersion = std::regex_match(aVersion, versionRegex);
    _valid = (validId && validVersion);
  }
};

When I call it as object.func("id", "version"); validId is true and validVersion is false. The more complex pattern in the comment fails as well. This happens in Visual Studio 2019, and in recent g++ and clang++ too. However, when I try the same version pattern here, ^[^\(\[\)\],]+$ matches the string "version", and the complex variant works as well. The patterns compile in the std::regex constructor (no exception is thrown). What am I doing wrong?

Thanks in advance.

Edit: Here it is on Godbolt. The original code is built as C++14 and the Godbolt example as C++17; both fail.

Upvotes: 2

Views: 346

Answers (1)

Wiktor Stribiżew

Reputation: 626893

The regular expressions you wrote are ECMAScript compatible, but you selected the std::regex::extended flavor, which is POSIX ERE.

In a POSIX ERE pattern, you cannot use regex escape sequences inside bracket expressions, where a backslash is just a literal character. For example, you cannot put \] inside a bracket expression and expect it to match a literal ]. In fact, the ] will close the bracket expression prematurely. The ^[^\(\[\)\],]+$ regex must be written as ^[^][(),]+$, because a ] at the very beginning of a bracket expression (directly after the negating ^, if there is one) is treated as a literal ] char. This is called smart placement; likewise, a literal - must be placed at the end of a bracket expression.
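If you want to keep std::regex::extended, a minimal sketch of the rewritten version pattern could look like the following. The helper names isValidVersionEre and checkEreVersionPattern are hypothetical (they are not in the question), and the sample inputs are only illustrations of what the pattern is meant to accept and reject.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper (not from the question): validates a version string
// with the POSIX ERE grammar, using the rewritten bracket expression.
bool isValidVersionEre(const std::string& version) {
    // ']' comes first inside the negated bracket expression, so it is literal.
    // No backslashes: inside a POSIX bracket expression they are ordinary chars.
    static const std::regex pattern{"^[^][(),]+$", std::regex::extended};
    return std::regex_match(version, pattern);
}

// Hypothetical self-check; call it from main() to try the pattern.
void checkEreVersionPattern() {
    assert(isValidVersionEre("version"));     // no brackets, parens, or commas
    assert(!isValidVersionEre("[1.0,2.0]"));  // contains excluded characters
    assert(!isValidVersionEre("1.0,2.0"));    // a comma is excluded
    assert(!isValidVersionEre(""));           // '+' needs at least one char
}
```

The id pattern ^[!-~]+$ contains no escapes, which is presumably why it already worked with std::regex::extended.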

The easiest fix here, though, is to remove the std::regex::extended option and use the default ECMAScript flavor:

std::regex idRegex{ _csIdPattern };
std::regex versionRegex{ _csVersionPattern };
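As a rough, self-contained illustration of that fix, the sketch below constructs both regexes without a flavor flag, so the default ECMAScript grammar applies and the original escaped patterns can stay unchanged. The function names isValidIdEcma, isValidVersionEcma, and checkEcmaPatterns are hypothetical, chosen for this example only.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: the id pattern from the question, default ECMAScript grammar.
bool isValidIdEcma(const std::string& id) {
    static const std::regex pattern{"^[!-~]+$"};
    return std::regex_match(id, pattern);
}

// Hypothetical helper: the original, escaped version pattern from the question.
// In ECMAScript, \( \[ \) \] inside a character class are literal characters.
bool isValidVersionEcma(const std::string& version) {
    static const std::regex pattern{"^[^\\(\\[\\)\\],]+$"};
    return std::regex_match(version, pattern);
}

// Hypothetical self-check; call it from main() to try both patterns.
void checkEcmaPatterns() {
    assert(isValidIdEcma("id"));
    assert(!isValidIdEcma("has space"));       // space is outside the !-~ range
    assert(isValidVersionEcma("version"));     // the case that failed with extended
    assert(!isValidVersionEcma("[1.0,2.0]"));  // brackets and comma are excluded
}
```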

Upvotes: 2
