Reputation: 10201
I'm trying to capture two groups of numbers, where each group is optional and should only be captured if it contains numbers. Here is a list of all valid combinations that it is supposed to match:
123(456)
123
(456)
abc(456)
123(efg)
And these are not valid combinations and should not be matched:
abc(efg)
abc
(efg)
However, my regex fails on combinations #4 and #5 even though they contain numbers.
const list = ["123(456)", "123", "(456)", "abc(456)", "123(def)", "abc(def)", "abc", "(def)"];
const regex = /^(?:(\d+))?(?:\((\d+)\))?$/;
list.map((a,i) => console.log(i+1+". ", a + "=>".padStart(11-a.length," "), JSON.stringify((a.match(regex)||[]).slice(1))));
So the question is: why, when ? is placed after a group, doesn't the regex "skip" that group if nothing matched?
P.S. This regex also captures #4, but not #5: /(?:^|(\d+)?)(?:\((\d+)\))?$/
Upvotes: 1
Views: 497
Reputation: 18490
Here is an idea that works without the global flag and is supposed to match only the needed items:
^(?=\D*\d)(\d+)?\D*(?:\((\d*)\))?\D*$
^(?=\D*\d) - the lookahead at the ^ start checks for at least one digit
(\d+)? - captures the digits into the optional first group
\D* - followed by any amount of non-digits
(?:\((\d*)\))? - digits in parentheses go into the optional second group
\D*$ - matches any amount of \D non-digits up to the $ end
See your JS demo or a demo at regex101 (the [^\d\n] is only for the multiline demo).
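As a rough sketch, this is how the pattern above could be run against the sample list from the question using a single match call and no global flag. The helper name captureGroups and the choice to report a missing group as null are assumptions made for illustration only.

```javascript
// Pattern from this answer: a lookahead requires at least one digit somewhere,
// then each numeric group is optional and surrounded by optional non-digits.
const pattern = /^(?=\D*\d)(\d+)?\D*(?:\((\d*)\))?\D*$/;

// Hypothetical helper: returns [group1, group2] (null for a group that did
// not participate), or null when the whole string is rejected.
function captureGroups(input) {
  const m = input.match(pattern);
  if (m === null) return null;
  return [m[1] ?? null, m[2] ?? null];
}

const list = ["123(456)", "123", "(456)", "abc(456)", "123(def)", "abc(def)", "abc", "(def)"];
for (const item of list) {
  console.log(item, "=>", JSON.stringify(captureGroups(item)));
}
// 123(456) => ["123","456"]
// 123 => ["123",null]
// (456) => [null,"456"]
// abc(456) => [null,"456"]
// 123(def) => ["123",null]
// abc(def) => null
// abc => null
// (def) => null
```

The leading lookahead is what rejects the last three items: without a digit anywhere in the string, the match fails before any group is tried.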
Upvotes: 2
Reputation: 10201
@WiktorStribiżew and @akash had good ideas, but they rely on the global flag, which requires an additional loop to gather all the matches.
For now, I came up with this regex, which matches anything but captures only what I need.
const list = ["123(456)", "123", "(456)", "abc(456)", "123(def)", "abc(def)", "abc", "(def)"];
const regex = /(?:(\d+)|^|[^(]+)+?(?:\((?:(\d+)|\D*)\)|$)+?/;
list.map((a,i) => console.log(i+1+". ", a + "=>".padStart(11-a.length," "), JSON.stringify((a.match(regex)||[]).slice(1))));
Upvotes: 1
Reputation: 587
What you're looking for can be done with lookaheads:
(?=^\d+(?:\(|$))(\d+)|(?=\d+\)$)(\d+)
Rough translation: a number from the start ending with a bracket (or end of line) OR a number in brackets somewhere in the text
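A possible way to use this pattern is sketched below. The answer does not say which flags to use; the global flag and the matchAll loop here are assumptions, chosen because each alternative matches only one of the two numbers at a time. The helper name collectNumbers is hypothetical.

```javascript
// Pattern from this answer, with the g flag added (an assumption) so that
// both alternatives can be found in the same string.
const lookaheadPattern = /(?=^\d+(?:\(|$))(\d+)|(?=\d+\)$)(\d+)/g;

// Hypothetical helper: loops over all matches and returns
// [leadingNumber, bracketedNumber], using null for a number that was not found.
function collectNumbers(input) {
  let first = null;
  let second = null;
  for (const m of input.matchAll(lookaheadPattern)) {
    if (m[1] !== undefined) first = m[1];   // number at the start of the string
    if (m[2] !== undefined) second = m[2];  // number followed by ")" at the end
  }
  return [first, second];
}

console.log(JSON.stringify(collectNumbers("123(456)"))); // ["123","456"]
console.log(JSON.stringify(collectNumbers("abc(456)"))); // [null,"456"]
console.log(JSON.stringify(collectNumbers("123(def)"))); // ["123",null]
console.log(JSON.stringify(collectNumbers("abc(def)"))); // [null,null]
```

Unlike a single anchored pattern, this approach never rejects a string outright; an invalid input simply yields two null entries.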
To answer the question about optional capture groups:
Yes, if a group is marked optional, e.g. (A*)?, the whole group becomes optional.
In your case the regex as a whole simply fails to match, even though the optional parts are allowed to be absent (you can verify this with a regex debugger).
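The point above can be illustrated with a small sketch. It uses the regex from the question; the unanchored variant is added here only for comparison and is not part of any answer.

```javascript
// Regex from the question: both groups are optional, but ^ and $ still
// require the pattern to account for the entire string.
const original = /^(?:(\d+))?(?:\((\d+)\))?$/;

// Nothing in the pattern can consume "abc" or "(def)", so the match fails
// as a whole, even though each group on its own is allowed to be absent.
console.log(original.test("abc(456)")); // false
console.log(original.test("123(def)")); // false

// Same pattern without the anchors (for comparison only): both optional
// groups are skipped at position 0, and the result is an empty match.
const unanchored = /(?:(\d+))?(?:\((\d+)\))?/;
const m = unanchored.exec("abc(456)");
console.log(m.index, JSON.stringify(m[0]), m[1], m[2]); // 0 "" undefined undefined
```

So the ? does let a group be skipped; the failure comes from the anchors, which leave the letters with nothing to match them.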
Upvotes: 1