ADE

Reputation: 85

Regex not returning all matches

I have the following regex (my actual regex is a lot more complex, but I narrowed my problem down to this): \s(?<number>123|456)\s

And the following test data:

" 123 456 "

I expect the regex to produce 2 matches: one with "number" being "123" and a second with "number" being "456". However, I'm only getting 1 match, with "number" being "123".

I did notice that adding another space between "123" and "456" in the test data does give 2 matches...

Why don't I get the result I want? How to get it right?

Upvotes: 3

Views: 1108

Answers (1)

Wiktor Stribiżew

Reputation: 626699

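Your pattern contains consuming \s patterns that match a whitespace character before and after a number, and the input contains consecutive numbers separated by a single whitespace. The trailing \s of the first match consumes that whitespace, so it is no longer available to the leading \s of the next match. If there were two spaces between the numbers, it would work.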
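To make the consuming behaviour concrete, here is a minimal sketch in Python. The question does not say which regex engine is in use, so Python is an assumption here, and its `re` module spells named groups `(?P<number>...)` rather than `(?<number>...)`. The helper name `numbers` is purely illustrative.

```python
import re

# Same pattern as in the question, with Python's (?P<name>...) group syntax.
consuming = re.compile(r"\s(?P<number>123|456)\s")

def numbers(text):
    """Return the 'number' group of every non-overlapping match."""
    return [m.group("number") for m in consuming.finditer(text)]

# One space between the numbers: the first match eats it,
# so "456" has no leading whitespace left to match against.
single_space = numbers(" 123 456 ")

# Two spaces between the numbers: one for each match.
double_space = numbers(" 123  456 ")

print(single_space)  # ['123']
print(double_space)  # ['123', '456']
```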

Use whitespace boundaries based on lookarounds:

(?<!\S)(?<number>123|456)(?!\S)


The (?<!\S) is a negative lookbehind that will fail the match if there is a non-whitespace char immediately to the left of the current location, and (?!\S) is a negative lookahead that will fail the match if there is a non-whitespace char immediately to the right of the current location.

(?<!\S) is the same as (?<=^|\s) and (?!\S) is the same as (?=$|\s), but more efficient.
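As a rough check of the lookaround version, the sketch below runs it in Python (again an assumption about the engine, with the named group written as `(?P<number>...)`). The extra test strings are made up for illustration and are not from the question.

```python
import re

# Whitespace boundaries: no non-whitespace char directly before or after the number.
boundary = re.compile(r"(?<!\S)(?P<number>123|456)(?!\S)")

def numbers(text):
    """Return the 'number' group of every non-overlapping match."""
    return [m.group("number") for m in boundary.finditer(text)]

# The lookarounds consume nothing, so the single space can serve both matches.
adjacent = numbers(" 123 456 ")

# Start and end of string also count as boundaries.
at_edges = numbers("123 456")

# "x123" and "4567" are rejected because a non-whitespace char touches the number.
embedded = numbers("x123 4567 456")

print(adjacent)  # ['123', '456']
print(at_edges)  # ['123', '456']
print(embedded)  # ['456']
```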

Note that in many situations you might even get by with a single lookahead and use

\s(?<number>123|456)(?!\S)

It ensures that consecutive whitespace-separated matches are found.
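A small Python sketch of the single-lookahead variant follows (Python and the `(?P<number>...)` spelling are assumptions, as the question does not name an engine). It also shows one trade-off worth keeping in mind: because the leading \s is still consuming, a number at the very start of the string is not matched.

```python
import re

# Leading whitespace is consumed; the trailing boundary is only asserted.
one_lookahead = re.compile(r"\s(?P<number>123|456)(?!\S)")

def numbers(text):
    """Return the 'number' group of every non-overlapping match."""
    return [m.group("number") for m in one_lookahead.finditer(text)]

# The space after "123" is not consumed, so it can start the "456" match.
found = numbers(" 123 456 ")

# No whitespace before "123" here, so only "456" is found.
no_leading_space = numbers("123 456")

print(found)             # ['123', '456']
print(no_leading_space)  # ['456']
```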

Upvotes: 3
