KorHosik

Reputation: 1257

JavaScript and regular expressions: get the number of parenthesized subpatterns

I have to get the number of parenthesized substring matches in a regular expression:

var reg=/([A-Z]+?)(?:[a-z]*)(?:\([1-3]|[7-9]\))*([1-9]+)/g,
nbr=0;

//Some code

alert(nbr); //2

In the above example, the total is 2: only the first and the last pairs of parentheses are capturing groups.

How can I get this number for any regular expression?

My first idea was to check the values of RegExp.$1 to RegExp.$9, but even when there are no corresponding parentheses, these values are not null but empty strings...

I've also seen the RegExp.lastMatch property, but this one represents only the value of the last matched characters, not the corresponding number.

So, I've tried to build another regular expression to scan any RegExp and count this number, but it's quite difficult...

Is there a better way to do that?

Thanks in advance!

Upvotes: 0

Views: 400

Answers (2)

Sheepy

Reputation: 18005

Well, judging from the code snippet we can assume that the input pattern is always a valid regular expression, because otherwise it would fail before the "Some code" part, right? That makes the task much easier!

We just need to count the opening parentheses that start a capturing group!

var reg = /([A-Z]+?)(?:[a-z]*)(?:\([1-3]|[7-9]\))*([1-9]+)/g;

var nbr = (' '+reg.source).match(/[^\\](\\\\)*(?=\([^?])/g);
nbr = nbr ? nbr.length : 0;

alert(nbr); // 2

And here is a breakdown:

  • [^\\] Make sure we don't start the match on an escaping backslash.
  • (\\\\)* Any number of escaped backslashes may sit before the opening parenthesis.
  • (?= Look ahead. More on this later.
    • \( The opening parenthesis we are looking for.
    • [^?] Make sure it is not followed by a question mark - which means it is capturing.
  • ) End of look ahead
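
To see the pieces of the breakdown at work, here is a small sketch that wraps the one-liner in a helper and runs it on a few patterns. The name countCapturingGroups is made up for this illustration; the regex itself is the one from the snippet above.

```javascript
// Hypothetical helper name; the body is the one-liner from this answer.
function countCapturingGroups(re) {
  // The leading space gives a parenthesis at position 0 something to follow.
  var m = (' ' + re.source).match(/[^\\](\\\\)*(?=\([^?])/g);
  return m ? m.length : 0;
}

console.log(countCapturingGroups(/(a)(b)/));   // 2: two plain capturing groups
console.log(countCapturingGroups(/(?:a)(b)/)); // 1: (?: is skipped by [^?]
console.log(countCapturingGroups(/\(a\)/));    // 0: escaped parentheses are literals
console.log(countCapturingGroups(/\\(a)/));    // 1: an escaped backslash, then a real group
```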

Why match with look ahead? To check that the parenthesis is not escaped, we need to match what goes before it. Look behind would be the natural tool, but JS didn't have it (it only arrived with ES2018), so the preceding character has to be part of the match.

Problem is, if there are two opening parentheses sticking together, then once we match the first parenthesis the second one would have nothing to back it up - its back has already been consumed! So to make sure a parenthesis can serve as the preceding character of the next one, we need to exclude it from the match.

And the space added to the source? It is there to be the back of the first character, in case it is an opening parenthesis.
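
A quick way to check this is to run both variants on a pattern with two adjacent opening parentheses. This is only a sketch: the "consuming" regex is a hypothetical variant written for comparison, with the look ahead contents moved into the match itself.

```javascript
// " ((a))" : the source of /((a))/ with the leading space prepended.
var source = ' ' + /((a))/.source;

// Hypothetical variant: the parenthesis is consumed by the match.
var consuming = source.match(/[^\\](\\\\)*\([^?]/g);

// The version from this answer: the parenthesis is only looked at.
var lookahead = source.match(/[^\\](\\\\)*(?=\([^?])/g);

console.log(consuming.length); // 1: the second "(" was swallowed by the first match
console.log(lookahead.length); // 2: the first "(" is still free to precede the second
```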
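
One caveat: scanning the source text like this can miscount in a few cases, for example a parenthesis inside a character class such as [(] gets counted, and named groups written as (?<name>...) do not. A different trick, not the method described above, is to let the regex engine do the parsing: append an empty alternative so the pattern matches the empty string, then look at the length of the exec result. The sketch below assumes an engine with RegExp.prototype.flags (ES2015+); the helper name is made up.

```javascript
// Hypothetical helper: count capturing groups by asking the engine itself.
function countGroupsViaExec(re) {
  // "pattern|" always matches the empty string through its empty alternative,
  // so exec('') returns an array: the full match plus one slot per capturing group.
  var probe = new RegExp(re.source + '|', re.flags);
  return probe.exec('').length - 1;
}

var reg = /([A-Z]+?)(?:[a-z]*)(?:\([1-3]|[7-9]\))*([1-9]+)/g;
console.log(countGroupsViaExec(reg));          // 2
console.log(countGroupsViaExec(/[(]/));        // 0: the "(" is inside a character class
console.log(countGroupsViaExec(/(?<y>\d+)/));  // 1: named groups are capturing (ES2018+)
```

Since a new RegExp object is built each time, the lastIndex of the original regex is left untouched.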

Upvotes: 1

jAndy

Reputation: 235982

JavaScript's String.prototype.match() method, called with a global regex, returns an Array of all matches (or null if there are none). You might just want to check the length of that result array.

var mystr = "Hello 42 world. This 11 is a string 105 with some 2 numbers 55";
var res = mystr.match(/\d+/g);

console.log( res.length ); // 5

Upvotes: 2
