Sean Phillips

Reputation: 133

RegExp not working as expected when capturing strings within parentheses

I am working on a problem where I get a string that has several sets of parentheses in it.

For example: "(024,025,026,027,028),(029,030,031,032,033)"

When I get the string, I also have a variable holding an id as a string, say "030". I create a regular expression like this:

var re = new RegExp(".*?\\((.*?" + id + ".*?)\\).*");

and do a replace as follows:

string.replace(re, "$1");

The problem is that if the number falls within the first set of parentheses, it works properly, but if it falls within the second set, it does not.

So:

var id = "024";   
var re = new RegExp(".*?\\((.*?" + id + ".*?)\\).*");
var string = "(024,025,026,027,028),(029,030,031,032,033)";
document.writeln(string.replace(re, "$1"));

Returns "024,025,026,027,028"

But:

var id = "029";   
var re = new RegExp(".*?\\((.*?" + id + ".*?)\\).*");
var string = "(024,025,026,027,028),(029,030,031,032,033)";
document.writeln(string.replace(re, "$1"));

Returns: "024,025,026,027,028),(029,030,031,032,033"

I am using ? to make the quantifiers lazy and minimize what is captured between the parentheses, but it does not seem to work. Can someone explain what I am missing?

Here is a JSFiddle http://jsfiddle.net/rdwAP/#&togetherjs=xVQ7Ltd8rO

Upvotes: 1

Views: 90

Answers (2)

Denys Séguret

Reputation: 382170

When you want to extract data from a string, it's usually a good idea

  • to avoid vague, catch-all, or negated patterns
  • to use match instead of replace

It's cleaner, more explicit, and faster.

Here you should use (\d+,)* and (,\d+)* instead of just .*?.

var id = "029";   
var re = new RegExp("\\(((\\d+,)*" + id + "(,\\d+)*)\\)");
var string = "(024,025,026,027,028),(029,030,031,032,033)";
document.writeln(string.match(re)[1]);

Note that this explicit regex will also fail in case of garbage input, which is generally considered a plus.
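
To make the "fails on garbage input" point concrete, here is a minimal sketch that wraps the explicit pattern in a function and checks for a failed match before indexing. The name findGroup is hypothetical (it is not part of the answer above), and the sketch assumes id contains only digits, so it needs no regex escaping.

```javascript
// Hypothetical helper, not from the original answer: returns the
// comma-separated group containing `id`, or null when nothing matches.
function findGroup(str, id) {
  // Assumption: `id` is digits only, so it is safe to concatenate as-is.
  // Non-capturing groups (?:...) keep the wanted text in capture group 1.
  var re = new RegExp("\\(((?:\\d+,)*" + id + "(?:,\\d+)*)\\)");
  var m = str.match(re);
  // match() returns null when there is no match, so guard before indexing.
  return m ? m[1] : null;
}

var string = "(024,025,026,027,028),(029,030,031,032,033)";
console.log(findGroup(string, "029"));      // "029,030,031,032,033"
console.log(findGroup(string, "999"));      // null
console.log(findGroup("(024;025)", "024")); // null (";" is not a valid separator)
```

With this sample string, an id of "30" also returns null, because the explicit pattern only matches whole comma-separated items rather than a substring of "030".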

Upvotes: 1

anubhava

Reputation: 785276

Don't use .*? in your regex; use a negated character class instead:

var id = "029";   
var re = new RegExp("\\(([^)]*" + id + "[^)]*)\\)");
var string = "(024,025,026,027,028),(029,030,031,032,033)";
string.replace(re, "$1");
//=> "(024,025,026,027,028),029,030,031,032,033"

  • [^)]* matches 0 or more of any character that is not ), so the match stops when it reaches a ).
  • Whereas .*? can also match ), so a match that starts at the first ( keeps expanding until it finds the id, even when the id is in the 2nd set of (...).
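
The difference can be seen side by side in a small sketch. It uses match instead of replace so that only the captured group is shown; the variable names lazy and bounded are just for illustration.

```javascript
var string = "(024,025,026,027,028),(029,030,031,032,033)";
var id = "029";

// Lazy dot: the match starts at the first "(" and .*? keeps expanding,
// crossing ")" characters, until `id` is found in the second group.
var lazy = new RegExp("\\((.*?" + id + ".*?)\\)");

// Negated class: [^)]* cannot cross ")", so the attempt at the first "("
// fails and the engine retries from the second "(".
var bounded = new RegExp("\\(([^)]*" + id + "[^)]*)\\)");

var lazyGroup = string.match(lazy)[1];
var boundedGroup = string.match(bounded)[1];

console.log(lazyGroup);    // "024,025,026,027,028),(029,030,031,032,033"
console.log(boundedGroup); // "029,030,031,032,033"
```

A lazy quantifier only prefers the shortest expansion from a fixed starting point; it does not make the engine pick a later starting point that would give a shorter overall match.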

Upvotes: 1
