Reputation: 10918
I'm not new to regex (or SO), but I can't seem to find a solid solution for matching the leftover spaces between matches.
For instance, I want to know what is inside quotes, and what is not, and do things to both.
Getting quotes is easy: (\".+?\"|'.+?') = quoteMatch
but making another match group to select everything else is not.
The closest I've gotten is quoteMatch+'|(.)'. This separates my quote groups from my everything-else groups, but it matches the "else" text one character at a time instead of grouping it together.
Trying quoteMatch+'|(.+)' selects everything together, and quoteMatch+'|(.+?)' puts me back a step.
I imagine I need to find a way to make the first match more greedy than the second, but anything I do to make it greedy makes it span multiple quotes and the things in between (i.e. match = "quote1" things in between "quote2").
I've also looked into using the split function, but it doesn't return the delimiters it split on, and it is not as elegant a solution as I imagine must exist.
Thank you for any help.
Upvotes: 1
Views: 33
Reputation: 240878
Move the pattern for the other characters inside the capturing group, as a third alternative:
(\".+?\"|'.+?'|.+?(?=["']|$))
The positive lookahead (?=["']|$) makes the lazy .+? stop just before the next quote or at the end of the line.
In doing so, an input of:
before quotes "quote1" in between quotes "quote2" after quotes
Would return:
(before quotes ), ("quote1"), ( in between quotes ), ("quote2"), ( after quotes)
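As a rough sketch of how this could be used, here is the pattern in Python. The question doesn't say which language is in play, so Python is an assumption, and the names TOKEN, split_quoted and is_quoted are made up for illustration. It also assumes the input is a single line.

```python
import re

# Pattern from the answer: a double-quoted string, a single-quoted string,
# or a lazy run of characters that stops just before the next quote
# or at the end of the line.
TOKEN = re.compile(r'''(".+?"|'.+?'|.+?(?=["']|$))''')


def split_quoted(text):
    """Return the quoted and unquoted segments of text, in order."""
    # With exactly one capturing group, findall returns that group's text.
    return TOKEN.findall(text)


def is_quoted(segment):
    """True if the segment starts and ends with the same quote character."""
    return (
        len(segment) > 2
        and segment[0] in "\"'"
        and segment[-1] == segment[0]
    )


if __name__ == "__main__":
    line = 'before quotes "quote1" in between quotes "quote2" after quotes'
    for segment in split_quoted(line):
        kind = "quoted" if is_quoted(segment) else "other"
        print(kind, repr(segment))
```

Each segment comes back whole, so you can branch on is_quoted and handle the two kinds differently.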
As a side note, you can also combine the first two alternations by using a backreference to close the quote:
((['"]).+?\2|.+?(?=["']|$))
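A sketch of the backreference version, again assuming Python (TOKEN_BACKREF and segments are illustrative names). Because the pattern now has two capturing groups, iterating over match objects is easier than working with the tuples findall would return. Group 2 holds the opening quote character for quoted segments and is unset for everything else.

```python
import re

# Group 1 is the whole segment; group 2 captures the opening quote
# so that \2 can require the same character to close it.
TOKEN_BACKREF = re.compile(r'''((['"]).+?\2|.+?(?=["']|$))''')


def segments(text):
    """Yield (segment, quote_char); quote_char is None for unquoted text."""
    for match in TOKEN_BACKREF.finditer(text):
        yield match.group(1), match.group(2)


if __name__ == "__main__":
    line = '''say "it's here" or 'plain' now'''
    for segment, quote_char in segments(line):
        print(repr(segment), repr(quote_char))
```

In this sample line the apostrophe inside the double-quoted string is not treated as a closing quote, because \2 only matches the character that opened the string.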
Upvotes: 1