Seph Reed

Reputation: 10918

Regex match the space between matches as well?

I'm not new to regex (or SO), but I can't seem to find a solid solution for matching the leftover spaces between matches.

For instance, I want to know what is inside quotes, and what is not, and do things to both.

Getting quotes is easy: (\".+?\"|'.+?') = quoteMatch

but making another match group to select everything else is not.

The closest I've gotten is quoteMatch+'|(.)'. This separates my quote groups from my everything-else groups, but it doesn't group the "else" characters together; each one becomes its own match.

Trying quoteMatch+'|(.+)' selects everything together and quoteMatch+'|(.+?)' puts me back a step.

I imagine I need to find a way to make the first match more greedy than the second, but anything I do to make it greedy makes it start taking over multiple quotes and the things in between (i.e. match = "quote1" things in between "quote2").

I've also looked into using the split function, but it doesn't return what the split was, and is not quite as elegant a solution as I imagine must exist.

Thank you for any help.

Upvotes: 1

Views: 33

Answers (1)

Josh Crozier

Reputation: 240878

Move the match for the remaining characters inside the capturing group as a third alternation:

(\".+?\"|'.+?'|.+?(?=["']|$))

The lazy .+? is followed by a positive lookahead, (?=["']|$), so it matches everything up to the next quote or the end of the line without consuming the quote.


In doing so, an input of:

before quotes "quote1" in between quotes "quote2" after quotes

Would return:

(before quotes ), ("quote1"), ( in between quotes ), ("quote2"), ( after quotes)
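To check this, here is a minimal sketch in Python (the question doesn't name a language, so that choice is an assumption, as are the names PATTERN, split_quoted and sample). It just runs the pattern above over the sample input and collects each match in order:

```python
import re

# The pattern from the answer: a double-quoted run, a single-quoted run,
# or a lazy run of anything up to the next quote or the end of the line.
PATTERN = re.compile(r"""(\".+?\"|'.+?'|.+?(?=["']|$))""")

def split_quoted(text):
    """Return the quoted and unquoted pieces of text in their original order."""
    return [m.group(1) for m in PATTERN.finditer(text)]

sample = 'before quotes "quote1" in between quotes "quote2" after quotes'
for piece in split_quoted(sample):
    print(repr(piece))
```

Joining the pieces back together gives the original string, so nothing between the quotes is lost.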

As a side note, you can also combine the first two alternations by using a backreference to close the quote:

((['"]).+?\2|.+?(?=["']|$))
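A possible way to use the backreference version, again sketched in Python as an assumption (PATTERN and classify are made-up names): group 2 only participates when the quoted alternative matched, so it can double as a "was this piece quoted?" flag for handling the two kinds of pieces differently.

```python
import re

# Backreference form: group 2 captures the opening quote and \2 requires
# the same character to close it. For unquoted pieces group 2 is unset.
PATTERN = re.compile(r"""((['"]).+?\2|.+?(?=["']|$))""")

def classify(text):
    """Return (piece, is_quoted) pairs in order of appearance."""
    return [(m.group(1), m.group(2) is not None) for m in PATTERN.finditer(text)]

for piece, is_quoted in classify("""say 'hi' and "it's" now"""):
    label = "quoted" if is_quoted else "other"
    print(label, repr(piece))
```

One side effect of the backreference is that a single quote inside a double-quoted string (as in "it's") no longer ends the quoted piece early.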

Upvotes: 1
