Reputation: 7481
I'm trying to match various SQL-like expressions in my company's code.
We have two types of INSERT:
1) InsertInto("TABLE").Values("FIELD_1", "VALUE_1", "FIELD_2", "VALUE_2").Execute()
In this case the Values function call always has an even number of arguments.
2) InsertInto("TABLE").Values("FIELD_1", "FIELD_2", selectExpression).Execute()
where selectExpression is a variable containing a SELECT query. Here there is no constraint on the number of arguments.
I'm using the following (simplified) regex to match the Values statement of the 1st case:
Values\(((?<insertfield>"\w+"),\s*(?<insertvalue>(\w|[ .()])+),?\s*)+\)
Unexpectedly, it also matches the 2nd case with an odd number of arguments, like the one above.
https://regex101.com/r/YF5f9i/1
I don't understand how that's possible, because "FIELD_1", does not seem to be matched at all.
Upvotes: 2
Views: 59
Reputation: 1904
It does match. The repeated group simply matches a second time afterwards, and Regex101 only shows the last capture of a repeated group.
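A minimal Python sketch of that claim, under the assumption that the pattern is ported to Python's `re` module, which spells named groups `(?P<name>...)` instead of `(?<name>...)`. Like Regex101, `re` keeps only the last capture of a group that repeats. The names `PATTERN` and `TEXT` are illustrative.

```python
import re

# Pattern from the question; only the named-group syntax is adapted to Python.
PATTERN = re.compile(
    r'Values\(((?P<insertfield>"\w+"),\s*(?P<insertvalue>(\w|[ .()])+),?\s*)+\)'
)
TEXT = 'InsertInto("TABLE").Values("FIELD_1", "FIELD_2", selectExpression).Execute()'

match = PATTERN.search(TEXT)

# The overall match covers the whole Values(...) call, including "FIELD_1".
print(match.group(0))

# The groups only report what the LAST iteration of the repetition captured.
print(repr(match.group(1)))
print(repr(match.group("insertfield")))
print(repr(match.group("insertvalue")))
```

The overall match starts at `Values(` and includes `"FIELD_1"`, while the named groups report only `"FIELD_2"` and `selectExpression).Execute(`.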
Values\(((?<insertfield>"\w+"),\s*(?<insertvalue>(\w|[ .()])+),?\s*)+\)
on InsertInto("TABLE").Values("FIELD_1", "FIELD_2", selectExpression).Execute()
boils down to the core
((?<insertfield>"\w+"),\s*(?<insertvalue>(\w|[ .()])+),?\s*)+
on "FIELD_1", "FIELD_2", selectExpression).Execute(
(with a ^ at the beginning and a $ at the end, if you want to be very explicit)
To simplify: (\w|[ .()])+ is the same as [\w .()]+
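A quick sanity check of that simplification, sketched in Python. The sample strings are made up for illustration; agreement on a handful of samples is not a proof, only a spot check that both forms accept and reject the same inputs.

```python
import re

alternation = re.compile(r'(\w|[ .()])+')   # form used in the question
char_class = re.compile(r'[\w .()]+')       # simplified form

# Illustrative samples: some should be accepted, some rejected.
samples = ['selectExpression).Execute(', ' ', 'a.b(c) d', '"FIELD_2"', '']

# Map each sample to (accepted by alternation, accepted by character class).
results = {
    s: (bool(alternation.fullmatch(s)), bool(char_class.fullmatch(s)))
    for s in samples
}
for sample, (alt_ok, cls_ok) in results.items():
    print(repr(sample), alt_ok, cls_ok)
```

Note that a single space is accepted by both forms, which matters for the explanation that follows.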
((?<insertfield>"\w+"),\s*(?<insertvalue>[\w .()]+),?\s*)+
matches "FIELD_1", followed by one space (insertvalue allows a space, so that single space is enough to satisfy it) as well as "FIELD_2", selectExpression).Execute(
This means that the unnamed group ("Group 1" in your example) has 2 captures. With a name added for clarity:
(?<unnamedGroup>(?<insertfield>"\w+"),\s*(?<insertvalue>[\w .()]+),?\s*)+
the captures are
"FIELD_1", (with the trailing space)
"FIELD_2", selectExpression).Execute(
And as Regex101 only displays the last capture, it displays "FIELD_2", selectExpression).Execute(
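To make both captures visible, here is a rough Python sketch. Python's `re` module also keeps only the last capture of a repeated group, so this emulates the per-iteration captures by running one iteration of the group over the text the repetition consumed. That is an assumption-laden workaround, not the engine's own capture list: it happens to reproduce the two iterations for this input, but it is not guaranteed to do so for every pattern. `FULL`, `ONE_ITERATION`, and `arguments` are illustrative names, and the named groups use Python's `(?P<name>...)` spelling.

```python
import re

TEXT = 'InsertInto("TABLE").Values("FIELD_1", "FIELD_2", selectExpression).Execute()'

# Simplified full pattern from the answer, in Python syntax.
FULL = re.compile(
    r'Values\(((?P<insertfield>"\w+"),\s*(?P<insertvalue>[\w .()]+),?\s*)+\)'
)
# One iteration of the repeated group, used on its own.
ONE_ITERATION = re.compile(
    r'(?P<insertfield>"\w+"),\s*(?P<insertvalue>[\w .()]+),?\s*'
)

full_match = FULL.search(TEXT)
# Text consumed by the repetition: drop the leading 'Values(' and the final ')'.
arguments = full_match.group(0)[len("Values("):-1]

# Collect (insertfield, insertvalue) for every iteration found in that text.
captures = [
    (m.group("insertfield"), m.group("insertvalue"))
    for m in ONE_ITERATION.finditer(arguments)
]
for field, value in captures:
    print(repr(field), repr(value))
```

For this input the first pair is `"FIELD_1"` with a single space as its value, and the second is `"FIELD_2"` with `selectExpression).Execute(`.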
This took me a while to figure out...
Upvotes: 1