Elle

Reputation: 334

Recursive regular expression match with boost

I have a problem: the C++ standard regex library fails to compile a recursive regex.

Looking it up, I found that this is a well-known limitation and that people suggest using the Boost library instead. This is the regex in question:

\\((?>[^()]|(?R))*\\)|\\w+

I want to use this regex to split statements on spaces and brackets, including balanced brackets nested inside brackets. However, every piece of code I have found that shows how to do this with Boost fails to work properly, and I don't know why. Thanks in advance.

Upvotes: 1

Views: 461

Answers (1)

Wiktor Stribiżew

Reputation: 626903

You may declare the regex using a raw string literal, with the R"(...)" syntax. This way, you won't have to double the backslashes.

Compare these two equivalent declarations:

std::string my_pattern("\\w+");
std::string my_pattern(R"(\w+)");

The outer parentheses in R"(...)" are not part of the regex pattern; they belong to the raw string literal delimiters.
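As a quick check of that claim, here is a minimal sketch that builds the pattern text both ways. The helper names escaped_pattern and raw_pattern are made up for illustration and do not come from the question.

```cpp
#include <string>

// Hypothetical helper: the pattern spelled with doubled backslashes.
std::string escaped_pattern() {
    return "\\w+";
}

// Hypothetical helper: the same pattern as a raw string literal.
// R"( and )" are delimiters, so they do not end up in the string.
std::string raw_pattern() {
    return R"(\w+)";
}
```

Both functions should return the same three characters: a backslash, w, and +.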

However, your regex is not quite correct: you need to recurse only the first alternative and not the whole regex.

Here is the fix:

std::string my_pattern(R"((\((?:[^()]++|(?1))*\))|\w+)");

Here, (\((?:[^()]++|(?1))*\)) matches a (, then zero or more repetitions of either 1+ chars other than ( and ) or a recursion into the whole Group 1 pattern via the (?1) regex subroutine, and then a closing ).

See the regex demo.

Upvotes: 1
