Hybrid

Reputation: 7049

JS Regex for Capturing Last Set of Parentheses (Excluding Nested)

I have found a few articles that talk about capturing what's inside a set of parentheses, but I can't seem to find one that specifically ignores nested parentheses. I would also like to capture only the last set.

So in essence there are three rules:

  1. Capture text INSIDE of parentheses
  2. Capture the content of only the LAST set of parentheses
  3. Capture the content inside ONLY ONE SET of parentheses (do not touch nesting)

Here are the 3 examples:

What would be the correct way of programming this in JS/jQuery (.match(), .exec())?

Upvotes: 0

Views: 186

Answers (2)

Bram Vanroy

Reputation: 28437

https://regex101.com/r/UOFxWC/2

var strings = [
  'Pokemon Blue Version (Gameboy Color)',
  'Pokemon (International) (Gameboy Color)',
  'Pokemon Go (iPhone (7))'
];

strings.forEach(function(string) {
  var re = /\(([^)]+\)?)\)(?!.*\([^)]+\))/ig;
  var results = re.exec(string);
  console.log(results.pop());
});
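
As a minimal sketch, the same pattern can be wrapped in a small helper that guards against strings with no match, since exec() returns null in that case. The function name lastParenContent is hypothetical, and the i and g flags are dropped here on the assumption that they are not needed for a single exec() call.

```javascript
// Hypothetical helper name; the pattern is the one from the snippet above,
// without the i and g flags.
function lastParenContent(str) {
  var re = /\(([^)]+\)?)\)(?!.*\([^)]+\))/;
  var match = re.exec(str);
  // exec() returns null when nothing matches, so check before reading group 1.
  return match ? match[1] : null;
}

console.log(lastParenContent('Pokemon Blue Version (Gameboy Color)'));
console.log(lastParenContent('Pokemon (International) (Gameboy Color)'));
console.log(lastParenContent('Pokemon Go (iPhone (7))'));
console.log(lastParenContent('Pokemon Go'));
```

Note that the pattern only allows for one nested group that closes immediately before the outer one, as in the third sample string. Deeper or differently placed nesting needs a manual parse.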

Alternatively, you can parse the string yourself. The idea is to walk the string from the end: each time you see a ) add one to depth, and each time you see a ( subtract one. While depth is greater than 0, prepend the current character to a temporary string. Because you only want the final group, you can bail out (break) as soon as you have a full match, i.e. the temporary string is non-empty and depth is back to zero. Note that this will not work with broken data: when the parentheses are not balanced you'll get odd results, so you have to make sure your data is correct.

var strings = [
  'Pokemon Blue Version (Gameboy Color)',
  'Pokemon (International) (Gameboy Color)',
  'Pokemon Go (iPhone (7))',
  'Pokemon Go (iPhon(e) (7))',
  'Pokemon Go ( iPhone ((7)) )'
];

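strings.forEach(function(string) {
  var chars = string.split('');
  var tempString = '';
  var depth = 0;
  var char;
  // Walk the string from the last character to the first.
  while (char = chars.pop()) {
    // Reading backwards, an opening parenthesis closes a level.
    if (char === '(') {
      depth--;
    }
    // Collect everything that lies inside the outermost group.
    if (depth > 0) {
      tempString = char + tempString;
    }
    // Reading backwards, a closing parenthesis opens a level.
    if (char === ')') {
      depth++;
    }

    // Stop as soon as the last group is complete.
    if (tempString !== '' && depth === 0) break;
  }
  console.log(tempString);
});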
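
Since the back-to-front parse depends on balanced input, one option is to check the balance first and refuse to parse otherwise. The following is only a sketch under that assumption: isBalanced and lastGroupFromEnd are hypothetical names, and returning null for unbalanced input is one possible choice among several.

```javascript
// Hypothetical helper: true when every "(" has a matching ")".
function isBalanced(str) {
  var depth = 0;
  for (var i = 0; i < str.length; i++) {
    if (str[i] === '(') {
      depth++;
    } else if (str[i] === ')') {
      depth--;
      if (depth < 0) return false; // closing parenthesis with no opener
    }
  }
  return depth === 0;
}

// Same back-to-front idea as above, as a function that returns null
// (an assumed convention) when the parentheses are not balanced.
function lastGroupFromEnd(str) {
  if (!isBalanced(str)) return null;
  var depth = 0;
  var result = '';
  for (var i = str.length - 1; i >= 0; i--) {
    var ch = str[i];
    if (ch === '(') depth--;
    if (depth > 0) result = ch + result;
    if (ch === ')') depth++;
    if (result !== '' && depth === 0) break;
  }
  return result;
}

console.log(lastGroupFromEnd('Pokemon Go (iPhon(e) (7))'));
console.log(lastGroupFromEnd('Pokemon Go (iPhone (7)'));
```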

Upvotes: 3

Casimir et Hippolyte

Reputation: 89547

This is what I described in the comments. Feel free to define the behaviour you want when parentheses are not balanced (if needed):

function lastparens(str) {
    var count = 0;
    var start_index = false;
    var candidate = '';

    for (var i = 0, l = str.length; i < l; i++) {
        var char = str.charAt(i);

        if (char == "(") {
            if (count == 0) start_index = i;
            count++;
        } else if (char == ")") {
            count--;

            if (count == 0 && start_index !== false)
                candidate = str.substring(start_index, i + 1);

            if (count < 0 || start_index === false) {
                count = 0;
                start_index = false;
            }
        }
    }
    return candidate;
}

Test cases:

var arr = [ 'Pokemon Blue Version (Gameboy Color)',
            'Pokemon (International) (Gameboy Color)',
            'Pokemon Go (iPhone (7))',

            'Pokemon Go ( iPhon(e) (7) )',
            'Pokemon Go ( iPhone ((7)) )',
            'Pokemon Go (iPhone (7)' ];

arr.forEach(function (elt, ind) {
    console.log( elt + ' => ' + lastparens(elt) );
} );
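
The function above returns the group together with its outer parentheses. If only the text inside is wanted, as the question asks, a variant along the following lines might do. This is a sketch, not part of the original answer: the name lastParensInner is hypothetical, and returning null when no complete group exists is an assumption.

```javascript
// Hypothetical variant: returns the text inside the last top-level
// group, or null (assumed convention) when there is no complete group.
function lastParensInner(str) {
  var depth = 0;
  var start = -1;
  var inner = null;

  for (var i = 0; i < str.length; i++) {
    var ch = str.charAt(i);

    if (ch === '(') {
      if (depth === 0) start = i; // remember where a top-level group opens
      depth++;
    } else if (ch === ')') {
      if (depth === 0) continue;  // stray closing parenthesis, ignore it
      depth--;
      // A top-level group just closed: keep what lies between the two ends.
      if (depth === 0) inner = str.substring(start + 1, i);
    }
  }
  return inner;
}

console.log(lastParensInner('Pokemon (International) (Gameboy Color)'));
console.log(lastParensInner('Pokemon Go ( iPhone ((7)) )'));
console.log(lastParensInner('Pokemon Go (iPhone (7)'));
```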


Upvotes: 1
