Graham

Reputation: 8151

Regex to match all whitespace except when it is inside quotes

I'm looking for a regex that matches all whitespace in a string except when it is between quotes.

For example, if I have the following string:

 abc  def  " gh i " jkl  " m n o p " qrst  
-   --   --        -   --           -    --

I want to match the spaces that have a dash under them. The dashes are not part of the string; they are only there for illustration.

Can this be done?

Upvotes: 2

Views: 2436

Answers (3)

Avinash Raj

Reputation: 174696

You could try the following regex, which is based on a positive lookahead.

\s(?=(?:"[^"]*"|[^"])*$)

or

 (?=(?:"[^"]*"|[^"])*$)


Explanation:

  • \s matches a single whitespace character

  • (?=(?:"[^"]*"|[^"])*$) but only if the rest of the string, up to the end, consists of any number of the following:

    1. "[^"]*" an opening double quote, then [^"]* (zero or more characters that are not double quotes), then a closing double quote. This matches a complete double-quoted block such as "foo" or "ljilcjljfcl".

    2. | OR: if a quoted block cannot be matched at the current position, the engine tries the alternative to the right of the |, i.e. [^"].

    3. [^"] matches any single character that is not a double quote.

Take foo "foo bar" buz as an example string.

foo "foo bar" buz             

The engine tries \s at each whitespace character and then checks the lookahead: everything after that character, up to the end of the string, must be made up of double-quoted strings or [^"] characters. Take the first space. It is followed by the double-quoted string "foo bar", which "[^"]*" matches. The next character is a space, so "[^"]*" fails there and the engine switches to the alternative, [^"], which matches that space. Because * applies to the whole group, [^"] keeps matching the remaining characters one at a time until the end of the string. The condition is satisfied, so the first space is matched. For the space inside the quotes, the text that follows is bar" buz, which contains an unpaired double quote that neither alternative can consume, so the lookahead fails and that space is not matched.
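The walkthrough above can be checked with a short Python sketch. It assumes Python's standard re module, which supports this lookahead syntax; the names OUTSIDE_QUOTES and unquoted_space_positions are made up for illustration.

```python
import re

# A whitespace character, but only if everything after it up to the end of the
# string is made of complete "..." blocks or characters that are not quotes.
OUTSIDE_QUOTES = re.compile(r'\s(?=(?:"[^"]*"|[^"])*$)')


def unquoted_space_positions(text):
    """Return the index of every whitespace character outside double quotes."""
    return [m.start() for m in OUTSIDE_QUOTES.finditer(text)]


sample = 'foo "foo bar" buz'
positions = unquoted_space_positions(sample)

# Replace the matched spaces with underscores to make them visible.
marked = OUTSIDE_QUOTES.sub('_', sample)

print(positions)  # [3, 13]
print(marked)     # foo_"foo bar"_buz
```

The space at index 8, inside "foo bar", is skipped because the text after it contains a lone double quote.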

Upvotes: 7

Jonny 5

Reputation: 12389

If your regex flavor is PCRE, you could use (*SKIP)(*F) to skip the quoted parts and match (or replace) one or more \s everywhere else:

"[^"]*"(*SKIP)(*F)|\s+

Test at regex101.com
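Python's standard re module does not support PCRE backtracking verbs such as (*SKIP)(*F). A rough equivalent, swapped in here as a different technique, is to match the quoted blocks first and capture only the whitespace you actually want. The sketch below assumes you want to replace the unquoted whitespace; replace_unquoted_whitespace is a hypothetical helper name.

```python
import re

# Left alternative: a complete quoted block, matched so that it gets skipped.
# Right alternative: a run of whitespace, captured in group 1.
QUOTED_OR_SPACE = re.compile(r'"[^"]*"|(\s+)')


def replace_unquoted_whitespace(text, replacement):
    """Replace whitespace runs outside double quotes; leave quoted blocks as-is."""
    def swap(match):
        # Group 1 is set only when the whitespace alternative matched.
        if match.group(1) is not None:
            return replacement
        return match.group(0)

    return QUOTED_OR_SPACE.sub(swap, text)


result = replace_unquoted_whitespace('abc  def " gh i "  jkl', '_')
print(result)  # abc_def_" gh i "_jkl
```

Because the quoted block is consumed by the left alternative, the whitespace inside it is never offered to the \s+ branch, which is the same effect (*SKIP)(*F) gives in PCRE.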

Upvotes: 2

vks

Reputation: 67968

[ ](?=(?:[^"]*"[^"]*")*[^"]*$)

Try this. See the demo:

https://regex101.com/r/pM9yO9/7

This matches any space that is followed only by complete pairs of double quotes ("..."), that is, by an even number of ", and never by a lone ". The lookahead enforces this.
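To see the "even number of quotes ahead" idea in action, here is a small Python sketch that compares the regex with a plain left-to-right scan on the string from the question. Both function names are hypothetical, and the comparison assumes the quotes in the input are balanced.

```python
import re

# A space is outside quotes when an even number of '"' characters follow it.
EVEN_QUOTES_AHEAD = re.compile(r'[ ](?=(?:[^"]*"[^"]*")*[^"]*$)')


def spaces_by_regex(text):
    """Indexes of the spaces matched by the lookahead regex."""
    return [m.start() for m in EVEN_QUOTES_AHEAD.finditer(text)]


def spaces_by_scan(text):
    """Reference check: walk the string and toggle a flag at every quote."""
    inside = False
    found = []
    for index, char in enumerate(text):
        if char == '"':
            inside = not inside
        elif char == ' ' and not inside:
            found.append(index)
    return found


sample = ' abc  def  " gh i " jkl  " m n o p " qrst  '
print(spaces_by_regex(sample))
print(spaces_by_regex(sample) == spaces_by_scan(sample))
```

With balanced quotes the two approaches agree. With an unmatched quote they can differ, since the regex counts quotes to the right of each space while the scan counts from the left.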

Upvotes: 4
