jCuga

Reputation: 1543

boost regex without possessive quantifiers

Is there an equivalent way to write this regex without using possessive quantifiers (i.e. ++ and *+)?

boost::regex testing123("\"value\":\"((?:[^\\\"\\\\]++|\\\\.)*+)\"");

I think this is comparable(?):

boost::regex testing123("\"value\":\"(?>(?:(?>[^\\\"\\\\]+)|\\\\.)*)\"");

Update: The regex is trying to match quoted text, where the text inside the double quotes may contain any number of escaped quotes.

Upvotes: 0

Views: 384

Answers (2)

riwalk

Reputation: 14233

I've found that it is a valuable skill to know how to write regular expressions using as few bells and whistles as possible:

"value":"([^\\"]|\\.)*"

What this is essentially saying is:

  • Match "value":" (the easy part)
  • Match zero or more occurrences of:
    • Anything other than a \ or ", OR
    • A \ followed by any single character.
  • End the regex when matching the final "

This allows for any escape sequence, and it assumes that a backslash always introduces an escape sequence (meaning that \\" is not an escaped quote, but rather an escaped \ followed by the terminating quote).

Putting it into the same syntax that you had (by escaping special characters for the C++ string literal), we get:

boost::regex testing123("\"value\":\"([^\\\\\"]|\\\\.)*\"");

Always try to keep regular expressions simple.
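
For anyone who wants to try the simple pattern without Boost installed, here is a minimal sketch. It makes two assumptions that are not part of the answer above: it uses std::regex (ECMAScript grammar) in place of boost::regex, and it wraps the alternation in a non-capturing group so that capture group 1 holds the whole quoted text rather than only the last repetition. The function name extract_value is made up for illustration.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper: returns the raw (still escaped) text between the
// quotes of "value":"...", or std::nullopt if no complete pair is found.
std::optional<std::string> extract_value(const std::string& input) {
    // A raw string literal avoids the double escaping needed in a normal
    // C++ string. The regex itself is: "value":"((?:[^"\\]|\\.)*)"
    //   [^"\\]  any character except a quote or a backslash
    //   \\.     a backslash followed by any single character (an escape)
    static const std::regex pattern(R"re("value":"((?:[^"\\]|\\.)*)")re");

    std::smatch match;
    if (std::regex_search(input, match, pattern)) {
        return match[1].str();  // group 1: everything between the quotes
    }
    return std::nullopt;
}
```

Calling extract_value on {"value":"say \"hi\"","x":1} should give back say \"hi\" with the backslashes still in place; unescaping is left to the caller. No possessive quantifiers or atomic groups are needed for correctness here, since they only limit backtracking.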

Upvotes: 1

Vitus

Reputation: 11922

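Possessive quantifiers are just syntactic sugar for atomic grouping, i.e. (ab)*+ is equivalent to (?>(ab)*). Using this, you can rewrite your whole expression without possessive quantifiers: [^"\\]++ becomes (?>[^"\\]+) and (?:...)*+ becomes (?>(?:...)*), which is what your second version does. Note that (?>...) does not capture, so keep the outer capturing parentheses around it if you still need the matched text as group 1.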

Upvotes: 2
