Reputation: 612
Using a regex, how do you match everything except four digits in a row? Here is some sample text that I might be using:
foo1234bar
baz 1111bat
asdf 0000 fdsa
a123b
Matches might look something like the following:
"foo", "bar", "baz ", "bat", "asdf ", " fdsa", "a123b"
Here are some regular expressions I've come up with on my own that have failed to capture everything I need:
[^\d]+ (this one breaks up a123b, because it excludes every digit)
^.*(?=[\d]{4}) (this one does not include the text after the 4 digits)
^.*(?=[\d]{4}).* (this one includes the numbers)
Any ideas on how to get matches before and after a four-digit sequence?
Upvotes: 0
Views: 1000
Reputation: 4950
In Python the following is very close to what you want:
In [1]: import re
In [2]: sample = '''foo1234bar
...: baz 1111bat
...: asdf 0000 fdsa
...: a123b'''
In [3]: re.findall(r"([^\d\n]+\d{0,3}[^\d\n]+)", sample)
Out[3]: ['foo', 'bar', 'baz ', 'bat', 'asdf ', ' fdsa', 'a123b']
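One caveat on "very close": the pattern needs at least two non-digit characters per piece (one for each `[^\d\n]+`), so single-character pieces next to a four-digit run get dropped. A small sketch of that edge case; the inputs below are made up for illustration and are not from the question:

```python
import re

# Same pattern as in the findall call above.
PATTERN = r"([^\d\n]+\d{0,3}[^\d\n]+)"

# Illustrative inputs (assumptions, not part of the original sample).
single_both_sides = re.findall(PATTERN, "a1234b")
single_after = re.findall(PATTERN, "ab1234c")

print(single_both_sides)  # []      ('a' and 'b' are each too short to match)
print(single_after)       # ['ab']  ('c' is dropped)
```

If your real data can contain one-character pieces like these, the pattern will need adjusting.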
Upvotes: 0
Reputation: 424993
You haven't specified your app language, but practically every app language has a split function, and you'll get what you want if you split on \d{4}.
E.g., in Java:
String[] stuffToKeep = input.split("\\d{4}");
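The same idea can be sketched in Python with `re.split`. The function name `split_on_four_digits` is made up for this example, and splitting line by line and dropping empty pieces are my assumptions about what the question wants:

```python
import re

def split_on_four_digits(text):
    """Return the pieces of each line that lie between runs of four digits."""
    pieces = []
    for line in text.splitlines():
        # re.split drops the four-digit separators and keeps what is around them.
        for piece in re.split(r"\d{4}", line):
            if piece:  # skip empty strings produced by a separator at a line edge
                pieces.append(piece)
    return pieces

sample = "foo1234bar\nbaz 1111bat\nasdf 0000 fdsa\na123b"
result = split_on_four_digits(sample)
print(result)
# ['foo', 'bar', 'baz ', 'bat', 'asdf ', ' fdsa', 'a123b']
```

Note that a run of five or more digits is still split after its first four digits (for example, "ab12345cd" gives "ab" and "5cd"). If longer runs should be left alone, splitting on `(?<!\d)\d{4}(?!\d)` instead is one option.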
Upvotes: 4