Laurent

Reputation: 49

Find the first set of 5 digits in a text

I need to find the first set of 5 digits in a text like this:

;SUPER U CHARLY SUR MARNE;;;rte de Pavant CHARLY SUR MARNE Picardie 02310;Charly-sur-Marne;;;02310;;;;;;;;;;;;;;

I need to find the first 02310 only.

My regex finds every set of 5 digits instead of only the first:

([^\d]|^)\d{5}([^\d]|$)
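
The behaviour can be reproduced outside the editor. This is only a minimal sketch: it assumes Python's re module as a stand-in for the editor's regex engine, and the variable names are illustrative.

```python
import re

# Sample line from the question.
text = (";SUPER U CHARLY SUR MARNE;;;rte de Pavant CHARLY SUR MARNE "
        "Picardie 02310;Charly-sur-Marne;;;02310;;;;;;;;;;;;;;")

pattern = re.compile(r"([^\d]|^)\d{5}([^\d]|$)")

# A global search reports every 5-digit run, together with the
# neighbouring characters consumed by the two groups.
all_matches = [m.group(0) for m in pattern.finditer(text)]

# search() stops at the first hit, which is one way to get only the first run.
first = pattern.search(text)
```

Here the global search yields two matches, and each one also includes the character on either side of the digits, because the groups consume them.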

Upvotes: 3

Views: 1235

Answers (3)

The fourth bird

Reputation: 163217

To find the first 5 digits in the text, you could also skip over runs of non-digits (\D*) or runs of 1-4 digits, and then match the 5 digits that follow:

^(?=.*\b\d{5}\b)(?:\D*|\d{1,4})*\K\d{5}(?!\d)
  • ^ Start of string
  • (?=.*\b\d{5}\b) Assert that the string contains 5 consecutive digits between word boundaries
  • (?:\D*|\d{1,4})* Repeat 0+ times: match either a run of non-digits or 1-4 digits
  • \K\d{5} Forget what was matched so far, then match 5 digits
  • (?!\d) Assert that what follows is not a digit
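
For engines without \K, the same idea can be sketched with a capturing group. The example below assumes Python's re module, which does not support \K, so the 5 digits are captured in group 1 instead; the variable names are illustrative.

```python
import re

# Sample line from the question.
text = (";SUPER U CHARLY SUR MARNE;;;rte de Pavant CHARLY SUR MARNE "
        "Picardie 02310;Charly-sur-Marne;;;02310;;;;;;;;;;;;;;")

# Same pattern as above, with (\d{5}) in place of \K\d{5}.
pattern = re.compile(r"^(?=.*\b\d{5}\b)(?:\D*|\d{1,4})*(\d{5})(?!\d)")

m = pattern.search(text)
first_number = m.group(1) if m else None  # the captured 5 digits
first_start = m.start(1) if m else None   # where they begin in the text
```

Because the pattern is anchored at ^, it can match at most once per string, so group 1 holds a single 5-digit number.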


Upvotes: 0

Qwertiy

Reputation: 21380

I would use

(^(?:(?!\d{5}).)+)(\d{5})(?!\d)

It matches the fragment from the beginning of the string to the end of the first 5-digit number. In a replacement you can use $1 or $2 to substitute the corresponding part. For example, the replacement $1<$2> surrounds the number with < and >.
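
As a rough illustration of that replacement, here is a sketch assuming Python's re module, where group references in the replacement are written \1 and \2 rather than $1 and $2. The variable names are illustrative.

```python
import re

# Sample line from the question.
text = (";SUPER U CHARLY SUR MARNE;;;rte de Pavant CHARLY SUR MARNE "
        "Picardie 02310;Charly-sur-Marne;;;02310;;;;;;;;;;;;;;")

pattern = re.compile(r"(^(?:(?!\d{5}).)+)(\d{5})(?!\d)")

# \1 is everything before the number, \2 is the number itself.
wrapped = pattern.sub(r"\1<\2>", text, count=1)
```

Only the first 02310 ends up wrapped in < and >; the second one is left untouched.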

Upvotes: 0

Wiktor Stribiżew

Reputation: 626738

To match the first 5-digit number you may use

^.*?\K(?<!\d)\d{5}(?!\d)

As you want to remove the match, simply leave the Replace With field blank. The ^ matches the start of a line, .*? matches any 0+ chars other than line break chars, as few as possible, and the \K operator drops the text matched so far. Then, (?<!\d)\d{5}(?!\d) matches 5 digits that are not preceded or followed by another digit.

Another variation includes a capturing group/backreference:

Find What:      ^(.*?)(?<!\d)\d{5}(?!\d)
Replace With: $1


Here, instead of dropping the found text before the number, (.*?) is captured into Group 1 and $1 in the replacement pattern puts it back.
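
The capturing-group variant ports directly to engines without \K. A minimal sketch, assuming Python's re module (where the backreference is written \1 instead of $1) and illustrative variable names:

```python
import re

# Sample line from the question.
text = (";SUPER U CHARLY SUR MARNE;;;rte de Pavant CHARLY SUR MARNE "
        "Picardie 02310;Charly-sur-Marne;;;02310;;;;;;;;;;;;;;")

pattern = re.compile(r"^(.*?)(?<!\d)\d{5}(?!\d)")

# Put back Group 1 (the text before the number) and drop the number.
removed = pattern.sub(r"\1", text, count=1)
```

After the substitution the first 02310 is gone and the second one is still present.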

Upvotes: 1
