Morvael

Reputation: 3567

Regular Expression find space delimited numbers

I have a string that comes from user input through a messaging system. It can contain a series of 4-digit numbers, but because users are likely to mistype things, the matching needs to be a little flexible. I therefore want to let them type in just the numbers, or pepper their message with any other characters, and then take only the numbers that match one of the formats

=nnnn or nnnn

For this I have the Regular Expression:

(^|=|\s)\d{4}(\s|$)

This almost works. However, because it requires each group of 4 digits to start with an =, a space, or the start of the string, it misses every other set of numbers.

I tried this:

(^|=|\s*)\d{4}(\s|$)

But that means that any four digits followed by a space get matched, which is incorrect.

How can I match groups of numbers when a single space is both the end of one group and the beginning of the next? To clarify, this string:

Ack 9876 3456 3467 4578 4567

Should produce the matches:

9876
3456 
3467 
4578 
4567

Upvotes: 1

Views: 5199

Answers (2)

vks

Reputation: 67978

\b\d+\b

\b asserts position at a word boundary (^\w|\w$|\W\w|\w\W). It is a 0-width anchor, much like ^ and $. It doesn't consume any characters.

or

(?:^|(?<=[=\s]))\d{4}\b
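
As a rough sketch, the two patterns can be compared with Python's `re` module. The `noisy` sample string and the variable names are made up for illustration. Note that `\b\d+\b` accepts digit runs of any length, while the second pattern only accepts exactly four digits preceded by `=`, whitespace, or the start of the string.

```python
import re

text = "Ack 9876 3456 3467 4578 4567"

# Pattern 1: any run of digits between word boundaries.
any_digits = re.compile(r"\b\d+\b")
# Pattern 2: exactly four digits, preceded by start-of-string, '=' or whitespace.
four_digits = re.compile(r"(?:^|(?<=[=\s]))\d{4}\b")

print(any_digits.findall(text))
print(four_digits.findall(text))

# Hypothetical noisy input: only pattern 2 rejects the 2- and 6-digit runs.
noisy = "=1234 12 123456 5678"
print(any_digits.findall(noisy))   # ['1234', '12', '123456', '5678']
print(four_digits.findall(noisy))  # ['1234', '5678']
```

Both patterns find all five numbers in the question's sample string, because neither consumes the space between them.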

Upvotes: 1

Avinash Raj

Reputation: 174786

Here you need to use lookarounds, which don't consume any characters.

(?:^|[=\s])\K\d{4}(?=\s|$)

OR

(?:^|[=\s])(\d{4})(?=\s|$)

Your regex (^|=|\s)\d{4}(\s|$) fails because it first matches <space>9876<space>, consuming both spaces. It then looks for another space, equals sign, or start of the line before the next four digits, so the next match it finds is <space>3467<space>. It won't match 3456 because the space before 3456 was already consumed by the first match. In order to allow matches that share a delimiter, you need to put the delimiter pattern inside a lookaround. So when you put the last pattern (\s|$) inside a lookahead, it won't consume the space; it just asserts that the match must be followed by a space or the end of the line.
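
The difference can be reproduced in Python, as a minimal sketch. Python's built-in `re` module doesn't support `\K`, so this uses the capturing-group variant. For the comparison, the original pattern is rewritten with non-capturing groups around the delimiters so that `findall` returns only the digits; what it matches is unchanged. The variable names are illustrative only.

```python
import re

text = "Ack 9876 3456 3467 4578 4567"

# Original pattern: the trailing (\s|$) consumes the space, so the next
# number has no leading delimiter left to match.
consuming = re.compile(r"(?:^|=|\s)(\d{4})(?:\s|$)")
print(consuming.findall(text))  # ['9876', '3467', '4567']

# Lookahead variant: the trailing space is only asserted, not consumed,
# so it is still available as the leading delimiter of the next number.
lookahead = re.compile(r"(?:^|[=\s])(\d{4})(?=\s|$)")
print(lookahead.findall(text))  # ['9876', '3456', '3467', '4578', '4567']
```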

Upvotes: 2
