Reputation: 1427
I've got an expression held in a JS string and I want to split it into tokens. The string could contain any symbols or characters (it's actually a string expression).
I've been using
expr.split(/([^\"]\S*|\".+?\")\s*/)
But when there is a text symbol outside of quotes, it splits it wrongly.
E.g. when
expr = "Tree = \"\" Or Tree = \"hello cruel world\" + \" and xyz\""
then the Or gets mixed in with the following string.
Splitting on \b seems to be the way to go (is it?) but I don't know how to keep the strings in quotes together. So ideally in the above I'd get:
Tree
=
\"\"
Or
Tree
=
\"hello cruel world\"
+
\" and xyz\"
I suppose ideally I would find a tokenizer but if I could do it in regex that would be a major headache solved :)
thanks
Upvotes: 0
Views: 2335
Reputation: 240938
A simpler approach is to use .match() instead of .split() and match the characters between the quotes or groups of non-whitespace characters using an alternation:
/"[^"]+"|\S+/g
Explanation:
"[^"]+" - Match one or more non-" characters between the double quotes
| - Alternation
\S+ - ...or match groups of one or more non-whitespace characters
Usage:
var string = 'Tree = \"\" Or Tree = \"hello cruel world\" + \" and xyz\"';
var result = string.match(/"[^"]+"|\S+/g);
document.querySelector('pre').textContent = JSON.stringify(result, null, 4);
<pre></pre>
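One caveat with the pattern above: "[^"]+" needs at least one character between the quotes, so the empty string "" is only picked up by the \S+ branch, and \S+ will also swallow a quote that sits directly next to other characters (for example ""+"a b"). A possible variant, as a rough sketch only, is to allow empty quoted strings and keep quotes out of the second branch. The function name tokenize is made up for illustration, and this assumes strings never contain escaped quotes:

```javascript
// Hypothetical helper (the name is not from the original posts).
// Assumes quoted strings do not contain escaped quotes.
function tokenize(expr) {
  // "[^"]*"  : a quoted string, possibly empty
  // [^\s"]+  : a run of characters that are neither whitespace nor a quote
  // match() returns null when nothing matches, so fall back to an empty array.
  return expr.match(/"[^"]*"|[^\s"]+/g) || [];
}

var tokens = tokenize('Tree = "" Or Tree = "hello cruel world" + " and xyz"');
console.log(JSON.stringify(tokens, null, 4));
```

This still relies on whitespace to separate operators from names (x="" comes out as x= followed by ""), and an unterminated quote is silently dropped, so a real tokenizer is probably the better choice if the expressions get more complicated.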
Upvotes: 1