Marc

Reputation: 2859

Regex - Find numbers between 2000 and 3000

I need to find all four-digit numbers between 2000 and 3000.

Letters may appear before and after the number.

I thought I could use [2000-3000]{4}, but it doesn't work. Why?

Thank you.

Upvotes: 11

Views: 6699

Answers (6)

rui

Reputation: 11284

Hmm, tricky one. The dash - only applies to the characters immediately before and after it, so what your regex actually matches is exactly 4 characters, each between 0 and 3 inclusive (i.e., 0, 1, 2 and 3), e.g. 3210, 1230, 3333, etc. Try the expression below.

(2[0-9]{3})|(3000)
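
To see the character-class behaviour concretely, here is a small sketch in Python (the question does not name a language, so the `re` module and the sample strings are my assumptions). It compares the original pattern with the alternation above on whole-string matches.

```python
import re

# The original attempt: a character class holding 2, 0 and the range 0-3.
original = re.compile(r"[2000-3000]{4}")
# The alternation from this answer: 2000-2999, or exactly 3000.
fixed = re.compile(r"(2[0-9]{3})|(3000)")

# Hypothetical sample inputs, chosen for illustration.
samples = ["3210", "1230", "2500", "3000", "3001", "1999"]

# fullmatch requires the whole string to match the pattern.
original_hits = [s for s in samples if original.fullmatch(s)]
fixed_hits = [s for s in samples if fixed.fullmatch(s)]

print(original_hits)  # strings made only of the digits 0-3
print(fixed_hits)     # strings that are numbers in 2000..3000
```

The original pattern accepts 3210 and 3001 but rejects 2500, which is the opposite of what the question wants.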

Upvotes: 3

YOU

Reputation: 123937

How about

^(?:2\d{3}|3000)$

Or, as Amarghosh, Bart K. and jleedev pointed out, to match multiple instances:

\b(?:2[0-9]{3}|3000)\b

If you need to match a3000 or 3000a but not 13000, you would need lookahead and lookbehind, like

(?<![0-9])(?:2[0-9]{3}|3000)(?![0-9])
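
The difference between the word-boundary and lookaround versions can be sketched in Python as follows. The sample text is made up for illustration; the point is that `\b` does not fire between a letter and a digit, because both count as word characters.

```python
import re

word_boundary = re.compile(r"\b(?:2[0-9]{3}|3000)\b")
lookaround = re.compile(r"(?<![0-9])(?:2[0-9]{3}|3000)(?![0-9])")

# Hypothetical input mixing letters, longer numbers and out-of-range values.
text = "a3000 2500b 13000 x2999y 3001 2000"

# \b needs a word/non-word transition, so letters glued to the number block it.
boundary_hits = word_boundary.findall(text)
# The lookarounds only forbid adjacent digits, so adjacent letters are fine.
lookaround_hits = lookaround.findall(text)

print(boundary_hits)
print(lookaround_hits)
```

Here the word-boundary pattern finds only the standalone 2000, while the lookaround pattern also finds 3000, 2500 and 2999 next to letters, and both skip 13000 and 3001.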

Upvotes: 25

paxdiablo

Reputation: 882776

Regular expressions are rarely suitable for checking ranges since for ranges like 27 through 9076 inclusive, they become incredibly ugly. It can be done but you're really better off just doing a regex to check for numerics, something like:

^[0-9]+$

which should work on just about every regex engine, and then check the range manually.

In toto:

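import re

def isBetween2kAnd3k(s):
    # Reject anything that is not made up purely of digits.
    if not re.fullmatch(r"[0-9]+", s):
        return False
    i = int(s)
    # The range check is done numerically, not by the regex.
    return 2000 <= i <= 3000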

What your particular regex [2000-3000]{4} checks for is exactly four occurrences of any of the following characters: 2, 0, 0, the range 0-3, 0, 0, 0. In other words, exactly four digits, each drawn from 0-3.

With letters before and after, you will need to modify the regex and check the correct substring, something like:

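import re

def isBetween2kAnd3kWithLetters(s):
    # Optional letters, exactly four digits (captured), optional letters.
    m = re.fullmatch(r"[A-Za-z]*([0-9]{4})[A-Za-z]*", s)
    if not m:
        return False
    # Convert only the captured digit run, then range-check it numerically.
    i = int(m.group(1))
    return 2000 <= i <= 3000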

As an aside, a regex for checking the range 27 through 9076 inclusive would be something like this hideous monstrosity:

^(?:2[7-9]|[3-9][0-9]|[1-9][0-9]{2}|[1-8][0-9]{3}|90[0-6][0-9]|907[0-6])$

I think that's substantially less readable than using ^[1-9][0-9]+$ and then checking with an if statement whether the value is between 27 and 9076.
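
One way to gain confidence in a range regex like this is to compare it against a plain numeric check over every candidate value. The Python sketch below does that, using a grouped form of the pattern so that every alternative must span the whole string, and with the two-digit alternative written as [3-9][0-9]. The helper names and the upper test bound of 20000 are my own choices.

```python
import re

# Alternatives cover 27-29, 30-99, 100-999, 1000-8999, 9000-9069, 9070-9076.
range_regex = re.compile(
    r"(?:2[7-9]|[3-9][0-9]|[1-9][0-9]{2}|[1-8][0-9]{3}|90[0-6][0-9]|907[0-6])"
)

def in_range_by_regex(s):
    # fullmatch anchors the whole alternation at both ends of the string.
    return range_regex.fullmatch(s) is not None

def in_range_by_compare(s):
    # Simple digits-only check, then an ordinary numeric comparison.
    return re.fullmatch(r"[1-9][0-9]*", s) is not None and 27 <= int(s) <= 9076

# Collect every value on which the two approaches disagree.
mismatches = [
    n for n in range(0, 20000)
    if in_range_by_regex(str(n)) != in_range_by_compare(str(n))
]
print(mismatches)
```

An empty list means the two checks agree on every value tested, which also shows how much effort it takes just to trust the regex version.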

Upvotes: 9

Rorick

Reputation: 8953

The correct regex is \b(2\d{3}|3000)\b. That means: match the character '2' followed by exactly three digits (this matches anything from 2000 to 2999), or match exactly '3000'. There are some good tutorials on regular expressions:

  1. http://gnosis.cx/publish/programming/regular_expressions.html
  2. http://immike.net/blog/2007/04/06/the-absolute-bare-minimum-every-programmer-should-know-about-regular-expressions/
  3. http://www.regular-expressions.info/

Upvotes: 2

ghostdog74

Reputation: 343201

Why don't you check for greater than or less than? It's simpler than a regex.

num >= 2000 and num <= 3000
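
When the numbers are embedded in text with letters around them, the comparison still needs something to pull the digits out first. A minimal Python sketch, assuming a hypothetical helper `numbers_in_range` and treating every maximal run of digits as one number:

```python
import re

def numbers_in_range(text, low=2000, high=3000):
    # Find each maximal run of ASCII digits, keep the four-digit ones,
    # and let an ordinary comparison decide whether they are in range.
    found = []
    for run in re.findall(r"[0-9]+", text):
        if len(run) == 4 and low <= int(run) <= high:
            found.append(int(run))
    return found

# Hypothetical input for illustration.
result = numbers_in_range("a3000 2500b 13000 x2999y 3001")
print(result)
```

The regex here only locates digits; the range logic stays in the numeric comparison, so changing the bounds does not require rewriting a pattern.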

Upvotes: 1

queen3

Reputation: 15521

Here's an explanation of why, along with ways to match numeric ranges: http://www.regular-expressions.info/numericranges.html

Upvotes: 2
