Reputation: 8151
I'm looking for a regex that will match all whitespace in a string except when it is between quotes.
For example, if I have the following string:
abc def " gh i " jkl " m n o p " qrst
- -- -- - -- - --
I want to match the spaces that have a dash under them. The dashes are not part of the string, only for illustration purposes.
Can this be done?
Upvotes: 2
Views: 2436
Reputation: 174696
You could try the positive-lookahead-based regex below.
\s(?=(?:"[^"]*"|[^"])*$)
or, to match only a literal space character,
[ ](?=(?:"[^"]*"|[^"])*$)
Explanation:
\s matches a single whitespace character (space, tab, newline and so on).
(?=(?:"[^"]*"|[^"])*$) is a positive lookahead: the whitespace matches only if everything after it, up to the end of the string, consists of the two alternatives below, repeated zero or more times.
"[^"]*" matches an opening double quote, then [^"]* (zero or more characters that are not double quotes), then a closing double quote. So it matches a whole double-quoted block such as "foo" or "ljilcjljfcl".
| is alternation: if the next character does not start a double-quoted block, the engine tries the pattern to the right of the |, that is, [^"].
[^"] matches any single character except a double quote.
Take foo "foo bar" buz as an example string.
foo "foo bar" buz
\s first matches a whitespace character. The lookahead then checks that the rest of the string after it consists of double-quoted blocks or [^"] characters, repeated zero or more times, up to the end. For the first space, the lookahead first tries "[^"]*", which matches the double-quoted block "foo bar". The character after that block is a space, so "[^"]*" fails there and the engine switches to the other alternative, [^"], which matches that space. Because * applies to the whole group (?:"[^"]*"|[^"]), the group keeps repeating and consumes the remaining characters up to the end of the string. The condition is satisfied, so the first space is matched. The space inside "foo bar" is not matched: only one double quote remains after it, neither alternative can consume a lone double quote, and so the lookahead never reaches $.
Upvotes: 7
Reputation: 12389
If your regex flavor is PCRE, you could use (*SKIP)(*F) to skip the quoted parts and then match (or replace) one or more \s:
"[^"]*"(*SKIP)(*F)|\s+
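Python's built-in re module does not support (*SKIP)(*F), but the same idea (consume quoted blocks first, then discard them) can be approximated with a capturing alternation and a replacement callback. This is a swapped-in technique, not the PCRE verbs themselves; the function name collapse_outside_quotes and the "_" replacement are hypothetical choices for illustration.

```python
import re

# The quoted block is tried first (group 1), so whitespace inside it is
# consumed before the \s+ alternative gets a chance to match it.
TOKEN = re.compile(r'("[^"]*")|\s+')

def collapse_outside_quotes(text, replacement="_"):
    """Replace each whitespace run outside double quotes; keep quoted text."""
    def repl(match):
        # Group 1 is set only for a quoted block: put it back unchanged.
        if match.group(1) is not None:
            return match.group(1)
        return replacement
    return TOKEN.sub(repl, text)

print(collapse_outside_quotes('abc def " gh i " jkl'))  # abc_def_" gh i "_jkl
```

Unlike the lookahead versions, this scans the string once from left to right, which tends to be cheaper on long inputs.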
Upvotes: 2
Reputation: 67968
[ ](?=(?:[^"]*"[^"]*")*[^"]*$)
Try this. See the demo:
https://regex101.com/r/pM9yO9/7
This matches any space that is followed only by complete pairs of double quotes ("...") up to the end of the string, and never by a lone, unpaired ". The condition is enforced through the lookahead.
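One way to use this pattern is to split a string on the spaces outside quotes. The sketch below assumes Python's re module; SPACE_OUTSIDE_QUOTES and the sample inputs are illustrative names and values, not part of the answer.

```python
import re

# A literal space followed by an even number of double quotes
# (zero or more complete pairs) up to the end of the string.
SPACE_OUTSIDE_QUOTES = re.compile(r'[ ](?=(?:[^"]*"[^"]*")*[^"]*$)')

parts = SPACE_OUTSIDE_QUOTES.split('foo "foo bar" buz')
print(parts)  # ['foo', '"foo bar"', 'buz']

# With an unbalanced quote, a space before the stray quote is not matched,
# because an odd number of quotes follows it.
unbalanced = SPACE_OUTSIDE_QUOTES.split('a "b c')
print(unbalanced)  # ['a "b', 'c']
```

Because the lookahead only counts the quotes to the right of each space, input with unbalanced quotes may split in surprising places.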
Upvotes: 4