Reputation: 33
I am trying to split a string in JS on spaces except when the space is in a quote. However, an incomplete quote should be maintained. I'm not skilled in regex wizardry, and have been using the below regex:
var list = text.match(/[^\s"]+|"([^"]*)"/g)
However, if I provide input like sdfj "sdfjjk, this will become ["sdfj","sdfjjk"] rather than ["sdfj",""sdfjjk"].
Upvotes: 1
Views: 255
Reputation: 626903
You can use
var re = /"([^"]*)"|\S+/g;
By using \S (= [^\s]) instead of [^\s"], we just drop the " from the negated character class, so an unpaired quote stays attached to its token.
By placing the "([^"]*)" pattern before \S+, we make sure substrings in quotes are not torn apart, because the quoted alternative is tried first at each position. This should work if the string contains well-paired quoted substrings and only the last quote is unpaired.
Demo:
var re = /"([^"]*)"|\S+/g;
var str = 'sdfj "sdfjjk';
document.body.innerHTML = JSON.stringify(str.match(re));
Note that to get the captured texts in-between quotes, you will need to use RegExp#exec in a loop (as String#match with the g flag "drops" submatches and returns only the full matches).
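As a rough sketch of the RegExp#exec loop mentioned above: the function name tokenize is hypothetical and only for illustration. It keeps the captured group when the quoted alternative matched and the whole match otherwise, so an unpaired quote stays on its token.

```javascript
// Hypothetical helper (not from the answer): split on spaces, keep quoted
// substrings together, and return the text inside well-paired quotes.
function tokenize(text) {
  const re = /"([^"]*)"|\S+/g;
  const tokens = [];
  let m;
  // With the g flag, exec resumes from re.lastIndex on each call.
  while ((m = re.exec(text)) !== null) {
    // m[1] is defined only when the quoted alternative matched.
    tokens.push(m[1] !== undefined ? m[1] : m[0]);
  }
  return tokens;
}

console.log(tokenize('abc "def ghi" lmn "opq'));
// → [ 'abc', 'def ghi', 'lmn', '"opq' ]
```

If you want to keep the surrounding quotes on paired substrings, push m[0] unconditionally, which gives the same result as str.match(re).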
No idea what downvoter thought when downvoting, but let me guess. The quotes are usually used around word characters. If there is a "wild" quote, it is still a quote right before/after a word.
So, we can utilize word boundaries like this:
"\b[^"]*\b"|\S+
See regex demo.
Here, "\b[^"]*\b" matches a " that is followed by a word character, then zero or more characters other than ", and then a closing " that is preceded by a word character.
Moving further in this direction, we can make it as far as:
\B"\b[^"\n]*\b"\B|\S+
With \B" we require that the opening " be preceded by a non-word character (or the start of the string), and with "\B that the closing " be followed by a non-word character (or the end of the string).
A lot depends on what specific issue you have with your specific input!
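To see where the word-boundary version differs from the plain one, here is a small comparison on a made-up input with a stray quote in the middle. The sample string and variable names are assumptions for illustration only.

```javascript
// Illustrative input: the quote before "bar" is stray, "qux quux" is well-paired.
const input = 'foo "bar baz "qux quux"';

const plain = /"([^"]*)"|\S+/g;
const bounded = /\B"\b[^"\n]*\b"\B|\S+/g;

// The plain pattern pairs the stray quote with the next quote it finds.
const plainResult = input.match(plain);
// The bounded pattern rejects that pair, because the second quote is
// preceded by a space rather than a word character.
const boundedResult = input.match(bounded);

console.log(plainResult);   // → [ 'foo', '"bar baz "', 'qux', 'quux"' ]
console.log(boundedResult); // → [ 'foo', '"bar', 'baz', '"qux quux"' ]
```

Which behaviour is right depends on the input: the bounded pattern will not treat a quoted substring as quoted if it starts or ends with a space or punctuation.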
Upvotes: 1
Reputation:
Try the following:
text.match(/".*?"|[^\s]+/g).map(s => s.replace(/^"(.*)"$/, "$1"))
This repeatedly finds either properly quoted substrings (first), OR other sequences of non-whitespace. The map part is to remove the quotes around the quoted substrings.
> text = 'abc "def ghi" lmn "opq'
< ["abc", "def ghi", "lmn", ""opq"]
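One thing to watch with the one-liner: String#match returns null when nothing matches (for example, an empty or all-whitespace string), so calling .map on the result would throw. A minimal guarded sketch, where splitArgs is a hypothetical name:

```javascript
// Hypothetical wrapper around the one-liner above; falls back to an empty
// array when String#match returns null.
function splitArgs(text) {
  const parts = text.match(/".*?"|\S+/g) || [];
  // Strip the quotes only from tokens that are fully wrapped in them.
  return parts.map(s => s.replace(/^"(.*)"$/, "$1"));
}

console.log(splitArgs('abc "def ghi" lmn "opq')); // → [ 'abc', 'def ghi', 'lmn', '"opq' ]
console.log(splitArgs('   '));                    // → []
```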
Upvotes: 0