Reputation: 14109
I want to be able to separate a string into values by splitting by spaces, but if something is in parentheses I need it to be in a single value. So for example, (a b c) d e (f g) h should become ['a b c', 'd', 'e', 'f g', 'h']. What's a regex that will do that for me?
Upvotes: 1
Views: 117
Reputation: 18950
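Right, the standard JavaScript regex engine cannot handle nested patterns. If you use an engine with named subpattern calls, such as Perl or PHP (PCRE), you can do it with a pattern like this, written in extended mode (the x flag) so that whitespace in the pattern is ignored. Note that .NET does not support the (?(DEFINE)...) and (?&name) syntax; it offers balancing groups instead: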
(?(DEFINE)
(?<open>\()
(?<close>\))
(?<val>(?&open)|(\w\s?)+)
(?<start>(?&open)(?&val)(?&close))
)
(?&start)|(?<=\s)\w
It can be done in JavaScript too, using an extended regular expression library like XRegExp. Here is a sample that extracts the parenthesized groups, to give you the idea:
const str1 = '(a b c) d e (f g) h';
var s = XRegExp.matchRecursive(str1, '\\(', '\\)', 'g');
console.log(s);
// -> ['a b c', 'f g']
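For the flat input in the question, where parentheses are never nested, a plain JavaScript regex may be enough without any library. The following is a minimal sketch under that assumption; the function name splitFlat is made up for illustration and is not part of XRegExp or any other library.

```javascript
// Hypothetical helper: splits on whitespace but keeps "(...)" groups together.
// Assumption: parentheses are never nested.
function splitFlat(input) {
  const tokens = [];
  // Either a parenthesized group (its inside is captured in group 1)
  // or a run of non-whitespace characters.
  const re = /\(([^()]*)\)|\S+/g;
  for (const m of input.matchAll(re)) {
    // Group 1 is undefined when the \S+ alternative matched.
    tokens.push(m[1] !== undefined ? m[1] : m[0]);
  }
  return tokens;
}

console.log(splitFlat('(a b c) d e (f g) h'));
// -> ['a b c', 'd', 'e', 'f g', 'h']
```

With nested input such as (a (b) c) this sketch no longer gives the intended result, because [^()]* cannot cross an inner parenthesis.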
Upvotes: 0
Reputation: 4372
As mentioned in the comments, JavaScript regular expressions cannot match arbitrarily nested parentheses on their own, so here is some code that deals with your problem by combining regular expressions with a paren-counting pass:
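var str = '(a (b) c) d e (f g) h';
var match;
// Rough tokenizer: a "(" up to the nearest ")", or a run of non-space characters.
var myRe = /\([^]+?\)|\S+/g;
var result = [];
while (match = myRe.exec(str)) {
    result.push(match[0]);
}
var tmp = "";
var final = [];
for (var i = 0; i < result.length; i++) {
    var leftP = (result[i].match(/\(/g) || []).length;
    var rightP = (result[i].match(/\)/g) || []).length;
    if (leftP !== rightP) {
        // Unbalanced token: glue following tokens on until the parentheses balance.
        tmp += result[i];
        for (var j = i + 1; j < result.length; j++) {
            tmp += result[j];
            if ((tmp.match(/\(/g) || []).length === (tmp.match(/\)/g) || []).length) {
                final.push(tmp);
                tmp = "";
                // The outer loop's i++ moves on to j + 1; using j + 1 here would skip a token.
                i = j;
                break;
            }
        }
    } else {
        final.push(result[i]);
    }
}
// Put back the space that was lost after a ")" when tokens were glued together.
for (var i = 0; i < final.length; i++) {
    final[i] = final[i].replace(/\)(\S+)/g, ') $1');
}
// Strip the outermost pair of parentheses.
for (var i = 0; i < final.length; i++) {
    final[i] = final[i].replace(/^\(([^]+)\)$/, '$1');
}
// final -> ['a (b) c', 'd', 'e', 'f g', 'h']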
It may not be optimized, but I think it solves your problem.
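The same paren-counting idea can also be written as a single pass over the characters, without any regex tokenizing step. This is only a sketch of that alternative, not the code above: the name splitTopLevel is made up for illustration, and it assumes the parentheses are balanced and that groups are separated from their neighbours by whitespace.

```javascript
// Hypothetical helper: splits on whitespace at nesting depth 0 and
// drops only the outermost pair of parentheses of each group.
// Assumptions: balanced parentheses, whitespace between top-level items.
function splitTopLevel(input) {
  const tokens = [];
  let depth = 0;     // current parenthesis nesting level
  let current = '';  // token being built

  for (const ch of input) {
    if (ch === '(') {
      // Keep inner parentheses, drop the outermost opening one.
      if (depth > 0) current += ch;
      depth++;
    } else if (ch === ')') {
      depth--;
      // Keep inner parentheses, drop the outermost closing one.
      if (depth > 0) current += ch;
    } else if (depth === 0 && /\s/.test(ch)) {
      // Whitespace outside any parentheses ends the current token.
      if (current !== '') tokens.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  if (current !== '') tokens.push(current);
  return tokens;
}

console.log(splitTopLevel('(a (b) c) d e (f g) h'));
// -> ['a (b) c', 'd', 'e', 'f g', 'h']
```

Because whitespace inside parentheses is copied through unchanged, this version needs no step to restore lost spaces afterwards.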
Upvotes: 2