Reputation: 133
I need a regex match for everything in a string except the last sequence of four digits.
abc12345 => abc1
abc1234abc => abcabc
abc123.45678abc => abc123.4abc
1234abc => abc
12345abc => 1abc
I have tried lots and lots of things. The closest I got is with
.*[^(\d{4})]
but when there are characters after the last sequence of four digits, this fails.
Upvotes: 1
Views: 7074
Reputation: 1997
To glue together separate parts of a match, you can use capturing groups, whose contents can later be referred to as back-references. You can learn more about back-references here.
This regular expression solves your problem, and you can see a demo here:
([a-z0-9.]*)\d{4}([a-z0-9]*)
Basically, whenever you put something into round brackets (), it becomes a capturing group that you can refer back to. Here ([a-z0-9.]*) is the first group and ([a-z0-9]*) is the second. If, for example, you wanted to have this regex:
([a-z0-9.]*)([a-z0-9.]*)
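You could use a back-reference to that group:
([a-z0-9.]*)\1
Note that \1 matches the same text that the first group captured, not just the same pattern. If you only want to repeat the pattern itself, a quantifier does that:
([a-z0-9.]*){2}
but in this case {2} has to come right after the group it repeats. Back-references, by contrast, can be used anywhere after the group has been defined. For example: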
([a-z0-9.]*) Continuation of regular expression \1
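A small Python sketch of how a back-reference behaves may help here. The hyphen separator, the use of + instead of *, and the function names are assumptions made only for this illustration; they are not part of the regex above. In Python's re module, \1 has to repeat the exact text that group 1 captured, whereas writing the pattern a second time accepts any text that fits it.

```python
import re

# \1 must repeat the exact text captured by group 1.
SAME_TEXT = re.compile(r"([a-z0-9.]+)-\1")

# Writing the pattern twice accepts any two pieces that each fit the pattern.
SAME_PATTERN = re.compile(r"([a-z0-9.]+)-([a-z0-9.]+)")


def matches_same_text(text):
    """True if text is 'X-X' for some X made of a-z, 0-9 or dots."""
    return SAME_TEXT.fullmatch(text) is not None


def matches_same_pattern(text):
    """True if text is 'X-Y' where X and Y each fit the character class."""
    return SAME_PATTERN.fullmatch(text) is not None


print(matches_same_text("abc-abc"))     # True
print(matches_same_text("abc-abd"))     # False
print(matches_same_pattern("abc-abd"))  # True
```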
From your question it is not clear which programming language you are working in. However, most programming languages support capturing groups. What you want to do is create a regex like this, then access the first and second groups and concatenate them to get the text you want.
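Since the question does not name a language, here is a minimal sketch in Python, chosen only as an assumption. It uses the regex from this answer with * rather than + on the trailing group, so that inputs ending in the four digits also match. The function name drop_last_four_digits is hypothetical, and the character classes only cover lowercase letters, digits and dots, as in the pattern above.

```python
import re

# Assumed variant of the answer's regex: the trailing group uses * so it may be empty.
PATTERN = re.compile(r"([a-z0-9.]*)\d{4}([a-z0-9]*)")


def drop_last_four_digits(text):
    """Return text without its last run of four digits.

    If the whole string does not fit the pattern, it is returned unchanged.
    """
    match = PATTERN.fullmatch(text)
    if match is None:
        return text
    # The greedy first group pushes \d{4} as far right as possible,
    # so the two groups are everything before and after the last four digits.
    return match.group(1) + match.group(2)


for sample in ["abc12345", "abc1234abc", "abc123.45678abc", "1234abc", "12345abc"]:
    print(sample, "=>", drop_last_four_digits(sample))
```

The first group is greedy, which is what makes the engine settle on the last possible position for the four digits rather than the first.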
Upvotes: 1
Reputation: 1
(.*\d*)(\d{4})(.*)
This will allow you to combine the 1st and 3rd captured groups.
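As a rough sketch of that idea, assuming Python and single-line input (the function name remove_last_four_digits is made up for this example):

```python
import re

# The regex from this answer: group 2 is the last run of four digits.
PATTERN = re.compile(r"(.*\d*)(\d{4})(.*)")


def remove_last_four_digits(text):
    """Join groups 1 and 3, i.e. everything around the last four digits."""
    match = PATTERN.search(text)
    if match is None:
        # No run of four digits anywhere: leave the text as it is.
        return text
    return match.group(1) + match.group(3)


for sample in ["abc12345", "abc1234abc", "abc123.45678abc", "1234abc", "12345abc"]:
    print(sample, "=>", remove_last_four_digits(sample))
```

Because . does not match a newline by default, this sketch only handles strings without line breaks.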
Upvotes: 0