Alex Wang

Reputation: 481

JavaScript split by space ignoring parentheses

I'm trying to split a string by space but ignore those in parentheses, or after an opening parenthesis. I followed this solution but my case is a little bit more complicated. For example if the parentheses are balanced, that solution works fine:

// original string
let string = 'attribute1 in (a, b, c) attribute2 in (d, e)';
let words = string.split(/(?!\(.*)\s(?![^(]*?\))/g);
console.log(words);

expected result after split:

words = ['attribute1', 'in', '(a, b, c)', 'attribute2', 'in', '(d, e)']

However if the parentheses are not balanced, let's say:

// original string
let string = 'attribute1 in (a, b, c) attribute2 in (d, e';

Then the result I expected should be:

['attribute1', 'in', '(a, b, c)', 'attribute2', 'in', '(d, e']

instead of

['attribute1', 'in', '(a, b, c)', 'attribute2', 'in', '(d,', 'e']

How should I achieve this?

Upvotes: 3

Views: 170

Answers (1)

Kipras Melnikovas

Reputation: 419

We can balance the string out by adding the missing parentheses at the end.

Note that a situation like

"attribute1 in (a, b, c attribute2 in (d, e"

would result in

[ 'attribute1', 'in', '(a,', 'b,', 'c', 'attribute2', 'in', '(d, e' ]

and the solution assumes this is the expected outcome.

If so, here's the solution:

/**
 * @param {string} s
 * @returns {string[]}
 */
function split(s) {
  let unclosed_count = 0;

  // count unclosed parentheses
  for (let i = 0; i < s.length; i++) {
    if (s[i] === '(') {
      unclosed_count++;
    } else if (s[i] === ')') {
      unclosed_count--;
    }
  }

  // close off the parentheses
  for (let i = 0; i < unclosed_count; i++) {
    s += ')';
  }

  // split
  let words = s.split(/(?!\(.*)\s(?![^(]*?\))/g);

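  // remove the added parentheses from the last item
  // (skip when none were added: slice(0, -0) would return an empty string)
  if (unclosed_count > 0) {
    let li = words.length - 1;
    words[li] = words[li].slice(0, -unclosed_count);
  }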

  return words;
}

let string = 'attribute1 in (a, b, c) attribute2 in (d, e';
let words = split(string);

console.log(words);
// => [ 'attribute1', 'in', '(a, b, c)', 'attribute2', 'in', '(d, e' ]

cheers!


It's also worth considering the opposite case, where some closing parentheses ) are unmatched instead of the opening ones (.

e.g. "attribute1 in a, b, c) attribute2 in d, e)"

This was not mentioned in the question, so the solution doesn't handle it. If it matters, you'd do the same thing we did with unclosed_count, but in reverse, with e.g. unopened_count.
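
A rough sketch of what that counting step could look like, in case it helps. The helper name countUnmatched is hypothetical and not part of the split() function above. It also counts a little differently from the loop in split(): a ) only cancels a ( that came before it, so a string like ")(" is reported as one unopened and one unclosed parenthesis instead of netting out to zero.

```javascript
/**
 * Count unmatched parentheses in both directions in a single pass.
 * Hypothetical helper for illustration; not part of split() above.
 * @param {string} s
 * @returns {{unclosed_count: number, unopened_count: number}}
 */
function countUnmatched(s) {
  let unclosed_count = 0; // '(' still waiting for a ')'
  let unopened_count = 0; // ')' seen while no '(' was open

  for (const ch of s) {
    if (ch === '(') {
      unclosed_count++;
    } else if (ch === ')') {
      if (unclosed_count > 0) {
        unclosed_count--; // closes the most recent open '('
      } else {
        unopened_count++; // nothing to close
      }
    }
  }

  return { unclosed_count, unopened_count };
}

console.log(countUnmatched('attribute1 in a, b, c) attribute2 in d, e)'));
// => { unclosed_count: 0, unopened_count: 2 }
console.log(countUnmatched('attribute1 in (a, b, c) attribute2 in (d, e'));
// => { unclosed_count: 1, unopened_count: 0 }
```

This only covers the counting. How the split itself should treat unopened parentheses isn't specified in the question, so what to do with unopened_count afterwards depends on the output you expect.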

Upvotes: 3
