JohnZaj
Reputation: 3230

Regex to find numbers excluding four digit numbers

I am trying to figure out how to find numbers that are not years. (I'm defining a year as simply a number that is exactly four digits wide.)

For example, I want to pick up

1

12

123

But NOT 1234 in order to avoid dates (4 digits).

If the regex also picked up 12345, that would be fine, but it is not necessary for solving this problem.

(Note: these requirements may seem odd. They are part of a larger solution that I am stuck with.)

Upvotes: 1

Views: 1693

Answers (3)

Andrew Clark

Reputation: 208435

If lookbehind and lookahead are available, the following should work:

(?<!\d)(\d{1,3}|\d{5,})(?!\d)

Explanation:

(?<!\d)            # Previous character is not a digit
(\d{1,3}|\d{5,})   # Between 1 and 3, or 5 or more digits, place in group 1
(?!\d)             # Next character is not a digit

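If you cannot use lookarounds, the following should work, as long as the numbers are not directly adjacent to letters or underscores (\b treats those as word characters, so no boundary exists between them and a digit):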

\b(\d{1,3}|\d{5,})\b

Explanation:

\b                 # Word boundary
(\d{1,3}|\d{5,})   # Between 1 and 3, or 5 or more digits, place in group 1
\b                 # Word boundary
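
To make the difference between the two variants concrete, here is a minimal sketch that runs both patterns over the same input. The sample string and the variable names are made up for illustration; only the two patterns come from this answer.

```python
import re

# Both patterns are taken from the answer; everything else is illustrative.
LOOKAROUND = re.compile(r'(?<!\d)(\d{1,3}|\d{5,})(?!\d)')
WORD_BOUNDARY = re.compile(r'\b(\d{1,3}|\d{5,})\b')

# Hypothetical input: digits glued to letters, glued to an underscore,
# a four-digit "year", and a plain three-digit number.
sample = 'abc123 x_45 1234 678'

lookaround_hits = LOOKAROUND.findall(sample)
boundary_hits = WORD_BOUNDARY.findall(sample)

print(lookaround_hits)  # ['123', '45', '678']
print(boundary_hits)    # ['678']
```

Both variants skip 1234. The lookaround version only cares whether the neighbouring characters are digits, so it still finds 123 and 45; the \b version skips them because letters and underscores count as word characters.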

Python example:

>>> import re
>>> regex = re.compile(r'(?<!\d)(\d{1,3}|\d{5,})(?!\d)')
>>> regex.findall('1 22 333 4444 55555 1234 56789')
['1', '22', '333', '55555', '56789']

Upvotes: 5

Mithrandir

Reputation: 25337

Depending on the regex flavor you use, this might work for you:

(([0-9]{1,3})|([0-9]{5,}))
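
One thing to watch for with this pattern: it has no anchors or lookarounds, so it can match part of a longer run of digits. The sketch below, in Python, compares it with a variant wrapped in \b. The wrapped variant, the helper `whole_matches`, and the sample string are assumptions added for illustration, not part of the answer.

```python
import re

# Pattern exactly as given in the answer.
UNANCHORED = re.compile(r'(([0-9]{1,3})|([0-9]{5,}))')
# Hypothetical variant: the same pattern wrapped in word boundaries.
ANCHORED = re.compile(r'\b(([0-9]{1,3})|([0-9]{5,}))\b')


def whole_matches(pattern, text):
    """Return the full text of each match, ignoring capture groups."""
    return [m.group(0) for m in pattern.finditer(text)]


sample = '12 1234 56789'

unanchored_hits = whole_matches(UNANCHORED, sample)
anchored_hits = whole_matches(ANCHORED, sample)

print(unanchored_hits)  # ['12', '123', '4', '567', '89']
print(anchored_hits)    # ['12', '56789']
```

Without boundaries, the four-digit 1234 is split into 123 and 4 rather than skipped, and 56789 is split as well because the first alternative is tried first.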

Upvotes: 0

shift66

Reputation: 11958

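\\b(\\d{1,3}|\\d{5,})\\b in Java (backslashes are doubled inside a Java string literal, and the pattern must not contain spaces around the |, since they would be matched literally).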

Upvotes: -1
