Reputation: 6949
I need some help improving a regex! In JavaScript I have a regular expression which looks for pairs of numbers in a filename:
var nums = str.match(/[\d]{1,}[\d]{1,}/gi);
This will match
with (1200,627). I have tried to improve the regex, just in case there are more than two pairs of numbers, to look for the following: number (1 digit or more) + whitespace (1 or more) + x (zero or once) + whitespace (1 or more) + number (1 digit or more).
This should fail on the second example (using a 'y' instead of an 'x'). I thought the pattern would be:
[\d]{1,}[\s]?[x]?[\s]?[\d]{1,}
but it grabs all the digits in
with (1200,627,01) whereas I only want the first two numbers. I've written the code to deal only with the first two, but I was wondering where I was going wrong. Only a level 17 regex wizard can save me now! Thanks
Upvotes: 2
Views: 53
Reputation: 9211
You say you want "one or more" whitespace characters on either side of the "x", but you have used the ? quantifier, which means "zero or one". Thus, because you've also marked the "x" as optional, it will match any number of two or more digits: your first [\d]{1,} will match against 0, then your second one will match on 1.
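To see this concretely, here is a minimal sketch that runs the pattern from the question against two made-up filenames. The filenames are assumptions standing in for the question's own examples, which are not reproduced here.

```javascript
// Hypothetical filenames; the question's real examples are not shown.
const original = /[\d]{1,}[\s]?[x]?[\s]?[\d]{1,}/g;

// The trailing "01" matches too: the first [\d]{1,} takes "0",
// every optional part matches nothing, and the second takes "1".
const withSuffix = "banner 1200 x 627 01.jpg".match(original);
console.log(withSuffix); // [ '1200 x 627', '01' ]

// With a 'y' separator the pattern does not fail; each number is
// matched on its own, split into two digit runs.
const withY = "banner 1200 y 627.jpg".match(original);
console.log(withY); // [ '1200', '627' ]
```

In both cases the optional separator is what lets a lone run of two or more digits count as a "pair".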
Note that you do not need to enclose single atoms in a character class: [\d] can be more simply written as \d. Also, {1,} (meaning "one or more") is more easily written as +.
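As a quick check of that simplification, the sketch below compares the verbose and concise spellings on a few sample strings. The samples are assumptions chosen only for illustration.

```javascript
// Hypothetical sample strings, chosen only for illustration.
const verbose = /[\d]{1,}[\s]?[x]?[\s]?[\d]{1,}/g;
const concise = /\d+\s?x?\s?\d+/g;

const samples = ["1200x627", "1200 x 627 01", "no digits here"];

// Compare the match arrays (or null) produced by both spellings.
const identical = samples.every(
  (s) => JSON.stringify(s.match(verbose)) === JSON.stringify(s.match(concise))
);
console.log(identical); // true
```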
As you want "one or more" whitespace character on either side of the "x", I would go with:
\d+(?:(?:\s+x\s+)|\s+)\d+
Note that (?: ... ) is a "non-capture group", so these bits won't form part of your match array. However, I don't think you want "one or more" whitespace characters, as that won't match your first example. Instead, try this:
\d+(?:(?:\s*x\s*)|\s+)\d+
Where the * quantifier means "zero or more".
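A minimal sketch of how this last pattern behaves, again using made-up filenames as stand-ins for the question's examples (they are assumptions, not the asker's data):

```javascript
// Hypothetical filenames; the question's real examples are not shown.
const suggested = /\d+(?:(?:\s*x\s*)|\s+)\d+/g;

const tight = "banner 1200x627.jpg".match(suggested);
const spaced = "banner 1200 x 627 01.jpg".match(suggested);
const wrongSeparator = "banner 1200 y 627.jpg".match(suggested);

console.log(tight);          // [ '1200x627' ]
console.log(spaced);         // [ '1200 x 627' ]
console.log(wrongSeparator); // null
```

Because a separator is now mandatory, the lone 01 no longer matches. Note that the \s+ branch also accepts two numbers separated only by whitespace, such as 1200 627.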
Upvotes: 0
Reputation: 757
I used \d+\s?x?\s?\d+ as my regex (the same thing, just replacing {1,} with + and removing the unnecessary []). You can see the outcome of it here.
The reason it's matching the 01 is all the ? quantifiers. It matches the first \d+ (1 digit: 0), then 0 of \s, 0 of x, and 0 of \s, followed by \d+ (another 1 digit: 1).
The regex
(\d+)(?:\s?x\s?|\s)(\d+)
should do the trick. Test it here
(?:...) is a non-capture group, so it allows alternation without assigning a back reference to it. This part matches the characters between the two numbers (either an x with an optional whitespace character on each side, or a single whitespace character).
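Since this version captures the two numbers, they can be read straight from the match result. The following is a rough sketch, assuming a made-up filename and dropping the g flag so that match() returns the capture groups; the names pairPattern, found and pair are illustrative only.

```javascript
// Hypothetical filename; the question's real examples are not shown.
// No g flag here, so match() returns the capture groups as well.
const pairPattern = /(\d+)(?:\s?x\s?|\s)(\d+)/;

const found = "banner 1200 x 627 01.jpg".match(pairPattern);

// found[0] is the whole match; found[1] and found[2] are the captured numbers.
const pair = found ? [Number(found[1]), Number(found[2])] : null;
console.log(pair); // [ 1200, 627 ]
```

Only the first pair is returned, which fits the stated goal of keeping just the first two numbers.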
Upvotes: 1