Reputation: 547
I am sorry if the question title is a bit confusing; I will explain the problem in detail here.
I want to use a regular expression to match apple, orange, mango, apple[...], and orange[...], where the brackets may contain any number or be empty. (Note that mango will NOT have [].) Here are some valid examples:
Here is the regular expression I came up with:
/^(mango|(apple|orange)(\[[1-9][0-9]*\])?)$/
This regular expression works, but it usually returns more than one matching group. For example, apple[15] will give:
1. apple[15]
2. apple[15]
3. [15]
This behavior is expected, since each pair of () creates a group, but I wonder whether I am constructing this regular expression the right way, because it returns too many results for a single match.
Moreover, is there any way to simplify this regular expression? The pattern I want to match is fairly straightforward, but the expression seems complicated.
Thank you.
Upvotes: 1
Views: 182
Reputation: 5564
It returns those sub-groups because that is what () does: every pair of parentheses captures what it matches. If you want to group items together without capturing them in the output, use non-capturing groups, (?:). For example, (?:apple|orange) would match apple or orange but would not capture the group.
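As a minimal sketch of that difference, the snippet below runs the same input through a capturing and a non-capturing version of a simplified pattern. The variable names are made up for illustration, and the pattern is a shortened form of the one in the question, not a replacement for it.

```javascript
// Same pattern twice: once with capturing groups, once with (?:...) groups.
const capturing = /^(apple|orange)(\[[1-9][0-9]*\])?$/;
const nonCapturing = /^(?:apple|orange)(?:\[[1-9][0-9]*\])?$/;

const input = 'apple[15]';

// Spread the match results into plain arrays to drop the extra
// index/input properties that String.prototype.match attaches.
const withGroups = [...input.match(capturing)];
const withoutGroups = [...input.match(nonCapturing)];

console.log(withGroups);    // full match, then 'apple', then '[15]'
console.log(withoutGroups); // full match only
```

Both patterns accept exactly the same strings; only the contents of the result array differ.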
If you want only the entire match, without any subgroups, use the following:
^mango$|^(?:apple|orange)(?:\[(?:[1-9][0-9]*)?\])?$
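// Inputs to test; each comment notes whether the pattern should match.
var strArr = [
  'apple',      // match
  'orange',     // match
  'apple[]',    // match (empty brackets are allowed)
  'orange[]',   // match
  'apple[15]',  // match
  'apple[05]',  // no match (leading zero)
  'mango[]',    // no match (mango takes no brackets)
  'mango'       // match
];
var re = /^mango$|^(?:apple|orange)(?:\[(?:[1-9][0-9]*)?\])?$/;
// Print each input with the result of testing it against the pattern.
strArr.forEach(function(str) {
  document.body.insertAdjacentHTML('beforeend', str + ' - match? ' + re.test(str) + '<br>');
});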
Upvotes: 1
Reputation: 745
Your regular expression declares three capturing groups: an outer group G1 that wraps the whole alternation, with G2 and G3 nested inside it, i.e. (G1: mango|(G2)(G3)). This is why a match returns an array with four values:
1. apple[15]: the whole match
2. apple[15]: G1, (mango|(apple|orange)(\[[1-9][0-9]*\])?)
3. apple: G2, (apple|orange)
4. [15]: G3, (\[[1-9][0-9]*\])?
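A small sketch that prints this array for the expression from the question may make the numbering easier to follow. The variable names here are illustrative only.

```javascript
// The expression from the question, with its three capturing groups.
const original = /^(mango|(apple|orange)(\[[1-9][0-9]*\])?)$/;

// Spread into a plain array: [whole match, G1, G2, G3].
const appleGroups = [...'apple[15]'.match(original)];
console.log(appleGroups); // 'apple[15]', 'apple[15]', 'apple', '[15]'

// For mango only the outer group participates, so G2 and G3 are undefined.
const mangoGroups = [...'mango'.match(original)];
console.log(mangoGroups); // 'mango', 'mango', undefined, undefined
```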
If you change the regular expression to /^(mango)|(apple|orange)(\[[1-9][0-9]*\])?$/ you get the same result, except that #2 from above will be undefined unless the input is mango. Note that this expression still accepts mango[123], but the match will not include the number.
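The behavior of the altered expression can be checked with a short sketch like the one below (variable names are hypothetical). It also shows a side effect of moving the anchors: ^ now applies only to the mango branch and $ only to the apple/orange branch, so an input such as pineapple passes the test as well.

```javascript
// Altered expression: the alternation now splits the anchors between branches.
const altered = /^(mango)|(apple|orange)(\[[1-9][0-9]*\])?$/;

// G1 did not participate, so the second entry is undefined.
const appleResult = [...'apple[15]'.match(altered)];
console.log(appleResult); // 'apple[15]', undefined, 'apple', '[15]'

// Accepted, but only 'mango' is matched; the '[123]' suffix is ignored.
const mangoResult = [...'mango[123]'.match(altered)];
console.log(mangoResult); // 'mango', 'mango', undefined, undefined

// The second branch has no ^, so it can match at the end of a longer word.
const acceptsPineapple = altered.test('pineapple');
console.log(acceptsPineapple); // true
```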
Upvotes: 0