Arvid

Reputation: 133

Regex match everything but last four digits

I need a regex that matches everything in a string except the last sequence of four digits.

abc12345 => abc1
abc1234abc => abcabc
abc123.45678abc => abc123.4abc
1234abc => abc
12345abc => 1abc

I have tried lots and lots of things. The closest I got is with

.*[^(\d{4})]

but this fails when there are characters after the last sequence of four digits.

Upvotes: 1

Views: 7074

Answers (2)

Aleksandar Makragić

Reputation: 1997

The only way to glue together separate parts of a regular expression match is to use capture groups, which back-references can then refer to. You can learn more about back-references here.

This regular expression solves your problem, and you can see a demo here:

([a-z0-9.]*)\d{4}([a-z0-9]*)

Whenever you put something into parentheses (), it becomes a capture group that a back-reference can refer to: ([a-z0-9.]*) is the first group and ([a-z0-9]*) is the second. The second group uses * so that it can be empty when the string ends with the four digits, as in abc12345. As a separate example, suppose you wanted this regex:

([a-z0-9.]*)([a-z0-9.]*)

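You could use a back-reference to the first group instead, although \1 matches the exact text that group captured rather than re-applying its pattern: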

([a-z0-9.]*)\1

Of course, repeating the pattern could be written simply as:

([a-z0-9.]*){2}

but in this case {2} has to come directly after the group it repeats. Back-references are different: once a group has been defined, it can be referenced anywhere later in the expression. For example:

([a-z0-9.]*) Continuation of regular expression \1
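
To make the difference between a back-reference and a repeated group concrete, here is a minimal sketch in Python (an assumption, since the question does not name a language) using the standard re module. The test strings and the third pattern are made up for illustration.

```python
import re

# \1 must match the exact text that group 1 captured.
same_text = re.compile(r"([a-z0-9.]*)\1")
# {2} applies the group's pattern twice; the two pieces may differ.
same_pattern = re.compile(r"([a-z0-9.]*){2}")

print(bool(same_text.fullmatch("abcabc")))     # "abc" followed by "abc" again
print(bool(same_text.fullmatch("abcabd")))     # second half differs from the first
print(bool(same_pattern.fullmatch("abcabd")))  # any two runs of allowed characters

# A back-reference can appear anywhere after its group has been defined.
# This pattern is a hypothetical illustration, not taken from the question.
later = re.compile(r"([a-z]+)-\d+-\1")
print(bool(later.fullmatch("ab-12-ab")))
print(bool(later.fullmatch("ab-12-cd")))
```

Run as written, this should print True, False, True, True, False: the back-reference rejects abcabd because the second half is not a copy of the first, while the {2} version accepts it.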

From your question it is not clear which programming language you are working in. However, most programming languages support capture groups and back-references. What you want to do is build a regex like the one above, then access the first and second capture groups and concatenate them to get the resulting text.
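
As a rough sketch of that last step, here is how it could look in Python (chosen only for illustration). The helper name drop_last_four_digits is hypothetical, and the pattern lets the trailing group be empty so that strings ending in the four digits are handled.

```python
import re

# Two capture groups around the four digits to drop.
# Assumption: the trailing group may be empty (quantifier *).
PATTERN = re.compile(r"([a-z0-9.]*)\d{4}([a-z0-9]*)")


def drop_last_four_digits(text: str) -> str:
    """Return text without its last run of four digits (hypothetical helper)."""
    match = PATTERN.fullmatch(text)
    if match is None:
        # No four-digit run, or characters the pattern does not allow.
        return text
    # Concatenate what came before and after the four digits.
    return match.group(1) + match.group(2)


for sample in ["abc12345", "abc1234abc", "abc123.45678abc", "1234abc", "12345abc"]:
    print(sample, "=>", drop_last_four_digits(sample))
```

Because the first group is greedy, the engine backtracks from the end of the string, so the four digits it settles on are the last ones available.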

Upvotes: 1

Paul Hill

Reputation: 1

(.*\d*)(\d{4})(.*)

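The greedy first group pushes the four-digit match as far right as possible, so you can combine the 1st and 3rd capture groups to get everything except the last four digits.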
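
A minimal sketch of combining the groups, assuming Python and a hypothetical helper named strip_last_four, checked against the examples from the question:

```python
import re

# Three groups: text before, the last four digits, text after.
PATTERN = re.compile(r"(.*\d*)(\d{4})(.*)")


def strip_last_four(text: str) -> str:
    """Join groups 1 and 3, leaving out the last four digits (hypothetical helper)."""
    match = PATTERN.fullmatch(text)
    if match is None:
        return text  # the input contains no run of four digits
    return match.group(1) + match.group(3)


# Input/output pairs copied from the question.
examples = {
    "abc12345": "abc1",
    "abc1234abc": "abcabc",
    "abc123.45678abc": "abc123.4abc",
    "1234abc": "abc",
    "12345abc": "1abc",
}

for source, expected in examples.items():
    result = strip_last_four(source)
    print(source, "=>", result, "(ok)" if result == expected else "(mismatch)")
```

Note that . does not match a newline by default in Python, so multi-line input would need the re.DOTALL flag.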

Online Regular Expressions

Upvotes: 0
