mrbranden

Reputation: 1060

Regex: match only first instance of a pattern

Using a regex, we need to remove all text in a string before the first instance of four digits in a row. We have a regex that "sort of" works:

^((?!\d{4}\w).)*

Given the string foo-bar-spring_06-2006_02_25.rm, the desired output is: 2006_02_25.rm

That works - if there's only one instance of a four-digit pattern. The string: batt-fall_01-2001-11-10_0200-0400.rm produces this result: 0400.rm

It should produce: 2001-11-10_0200-0400.rm

Note: long story, but we cannot use a - or _ as a delimiter.

I feel like we're close. Does anyone have any suggestions?

Thanks!

Upvotes: 1

Views: 55

Answers (2)

blhsing

Reputation: 107095

You can use a positive lookahead pattern after a lazily repeated . instead:

^.*?(?=\d{4})

Demo: https://regex101.com/r/8DZDQp/1

Alternatively, you can group the 4 digits:

^.*?(\d{4})

and substitute the match with the first group $1.

Demo: https://regex101.com/r/8DZDQp/3
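Both variants can be tried outside regex101 as well. Below is a minimal sketch in Python, assuming the `re` module; the function names are made up for illustration, and Python writes the group reference as `\1` rather than `$1`:

```python
import re

def strip_prefix_lookahead(name: str) -> str:
    # Lazily consume from the start until the next four characters are digits;
    # the lookahead does not consume them, so they stay in the result.
    return re.sub(r"^.*?(?=\d{4})", "", name, count=1)

def strip_prefix_group(name: str) -> str:
    # Same idea, but the four digits are part of the match,
    # so they are put back via the first group.
    return re.sub(r"^.*?(\d{4})", r"\1", name, count=1)

for sample in (
    "foo-bar-spring_06-2006_02_25.rm",
    "batt-fall_01-2001-11-10_0200-0400.rm",
):
    print(strip_prefix_lookahead(sample), strip_prefix_group(sample))
```

If the string contains no run of four digits, neither pattern matches and the input comes back unchanged.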

Upvotes: 1

Emma

Reputation: 27743

A likely faster option would be to skip the unwanted leading part altogether and, without using lookarounds, capture only the part you want with a simple expression similar to:

(\d{4}.*\..+)$

or:

(\d{4}.*\.[a-z]+)$

The end anchor $ is also unnecessary; the expression would still work without it.
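As a rough illustration of this capture-instead-of-remove approach, here is a small Python sketch. It assumes the `re` module; the helper name `keep_from_first_year` is hypothetical, and returning the input unchanged when nothing matches is my own choice, not part of the answer:

```python
import re

# A search scans left to right, so the match starts at the first
# run of four digits and extends to the end of the string.
PATTERN = re.compile(r"(\d{4}.*\..+)$")

def keep_from_first_year(name: str) -> str:
    match = PATTERN.search(name)
    # Assumption: fall back to the original string when there is no match.
    return match.group(1) if match else name

print(keep_from_first_year("batt-fall_01-2001-11-10_0200-0400.rm"))
```

Here the wanted text is extracted directly instead of substituting away the prefix.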

Upvotes: 0
