Neil Barnwell

Reputation: 42155

Regular expression where part of string must be number between 0-100

I need to validate serial numbers. We use regular expressions in C# for this, and for a certain product, part of the serial number is the "seconds since midnight". There are 86400 seconds in a day, but how can I validate it as a 5-digit number in a string like the following?

654984051-86400-231324

I can't use this concept:

[0-8][0-6][0-4][0-0][0-0]

Because then 86399 wouldn't be valid. How can I overcome this? I want something like:

[00000-86400]

UPDATE
I want to make it clear that I'm aware of - and agree with - the "don't use regular expressions when there's a simpler way" school-of-thought. Jason's answer is exactly how I'd like to do it, however this serial number validation is for all serial numbers that pass through our system - there's currently no custom validation code for these specific ones. In this case I have a good reason for looking for a regex solution.

Of course, if there isn't one, then that makes the case for custom validation for these particular products undeniable, but I wanted to explore this avenue fully before going with a solution that requires code changes.

Upvotes: 6

Views: 5066

Answers (7)

Robert Harvey

Reputation: 180868

The "Generate a Regular Expression to Match an Arbitrary Numeric Range" tool at http://utilitymill.com/utility/Regex_For_Range yields the following regular expression:

\b0*([0-9]{1,4}|[1-7][0-9]{4}|8[0-5][0-9]{3}|86[0-3][0-9]{2}|86400)\b

Description of output:

First, break into equal length ranges:
  0 - 9
  10 - 99
  100 - 999
  1000 - 9999
  10000 - 86400

Second, break into ranges that yield simple regexes:
  0 - 9
  10 - 99
  100 - 999
  1000 - 9999
  10000 - 79999
  80000 - 85999
  86000 - 86399
  86400 - 86400

Turn each range into a regex:
  [0-9]
  [1-9][0-9]
  [1-9][0-9]{2}
  [1-9][0-9]{3}
  [1-7][0-9]{4}
  8[0-5][0-9]{3}
  86[0-3][0-9]{2}
  86400

Collapse adjacent powers of 10:
  [0-9]{1,4}
  [1-7][0-9]{4}
  8[0-5][0-9]{3}
  86[0-3][0-9]{2}
  86400

Combining the regexes above yields:
  0*([0-9]{1,4}|[1-7][0-9]{4}|8[0-5][0-9]{3}|86[0-3][0-9]{2}|86400)

Tested here: http://osteele.com/tools/rework/
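
Because the input space is small, a generated pattern like this can also be checked exhaustively against plain arithmetic. The following is a minimal sketch, not part of the tool's output: the class and method names are hypothetical, and it assumes the field is tested on its own, so the `\b` boundaries are swapped for `\A` and `\z` anchors.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper: the tool's pattern, anchored with \A and \z so the
// entire input (not just a word inside it) must be the number.
public static class SecondsRangeRegex
{
    private static readonly Regex Pattern = new Regex(
        @"\A0*([0-9]{1,4}|[1-7][0-9]{4}|8[0-5][0-9]{3}|86[0-3][0-9]{2}|86400)\z");

    public static bool IsMatch(string text) => Pattern.IsMatch(text);

    // Returns true when the pattern accepts every value in 0..86400
    // and rejects every value in 86401..99999.
    public static bool AgreesWithArithmetic()
    {
        for (int n = 0; n <= 99999; n++)
        {
            bool expected = n <= 86400;
            string text = n.ToString(CultureInfo.InvariantCulture);
            if (IsMatch(text) != expected) return false;
        }
        return true;
    }
}
```

Note that the leading `0*` means this pattern accepts any number of digits as long as the extra ones are leading zeros, so it does not by itself enforce a fixed 5-digit field.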

Upvotes: 6

Steve Wortham

Reputation: 22250

I would use regex combined with some .NET code to accomplish this. A pure regex solution isn't going to be an easy or efficient way to handle large number ranges.

But this will:

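using System.Text.RegularExpressions;
// Use Match rather than Replace: Replace returns the input unchanged when nothing matches.
Match match = Regex.Match(@"654984051-86400-231324", @"^\d{9}-(\d{5})-\d{6}$");
String value = match.Success ? match.Groups[1].Value : null;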

This will grab the value 86400 in this case. And then you'd just check if the captured number is between 0 and 86400 as per Jason's answer.

Upvotes: 0

Taylor Leese

Reputation: 52390

If you really need a pure regex solution, I believe this would work, although the other posters make a good point about using the regex only to check that the field consists of digits and then using a matching group to validate the actual number.

([0-7][0-9]{4})|(8[0-5][0-9]{3})|(86[0-3][0-9]{2})|(86400)

Upvotes: 0

Justin Johnson

Reputation: 31300

I don't believe this is possible with regular expressions, since this isn't something that can be checked as part of a regular language. In other words, a finite-state automaton cannot recognize this string, so a regular expression cannot either.

Edit: This can be recognized by a regex, but not in an elegant way. It would require a monster OR chain (e.g. 00000|00001|00002 or 0{1,5}|0{1,4}1|0{1,4}2). To me, having to enumerate such a large set of possibilities makes it clear that, while it is technically possible, it is not feasible or manageable.

Upvotes: -1

Jimmy

Reputation: 91482

With the standard 'this-is-not-a-particularly-regexy-problem' caveat,

[0-7]\d{4}|8[0-5]\d{3}|86[0-3]\d{2}|86400 
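
To use an alternation like this inside the full serial-number pattern, it has to be wrapped in a group; otherwise the `|` would split the whole expression rather than just the seconds field. A minimal sketch follows, assuming the layout seen in the sample serial (9 digits, the 5-digit field, 6 digits); that layout and the `SerialRegex` name are assumptions, not something the question states.

```csharp
using System.Text.RegularExpressions;

// Hypothetical validator for serials shaped like 654984051-86400-231324.
public static class SerialRegex
{
    // Assumed layout, inferred from the sample serial: 9 digits, a 5-digit
    // seconds field (00000-86400), then 6 digits.
    // [0-9] is used instead of \d because \d in .NET also matches
    // non-ASCII digits unless RegexOptions.ECMAScript is set.
    // (?:...) groups the alternation without creating a capture.
    private static readonly Regex Pattern = new Regex(
        @"\A[0-9]{9}-(?:[0-7][0-9]{4}|8[0-5][0-9]{3}|86[0-3][0-9]{2}|86400)-[0-9]{6}\z");

    public static bool IsValid(string serial) => Pattern.IsMatch(serial);
}
```

With this sketch, `SerialRegex.IsValid("654984051-86399-231324")` is accepted while the same serial with `86401` in the middle is rejected.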

Upvotes: 6

Greg Hewgill

Reputation: 993991

You don't want to try to use regular expressions for this; you'll end up with something incomprehensible, unwieldy, and difficult to modify (somebody will probably suggest one :). What you want to do is match the string using a regex to make sure that it contains digits in the format you want, then pull out a matching group and check the range using an arithmetic comparison. For example, in pseudocode:

match regex /(\d+)-(\d+)-(\d+)/
serial = capture group 2
if serial >= 0 and serial <= 86400 then
    // serial is valid
end if
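
The pseudocode above could be sketched in C# roughly as follows. The class name is hypothetical, and the pattern is anchored here so that the whole string must have the three-part shape, which the pseudocode does not spell out.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical checker: the regex validates the shape, arithmetic validates the range.
public static class SerialChecker
{
    // Shape check only: three dash-separated runs of ASCII digits.
    private static readonly Regex Shape =
        new Regex(@"\A([0-9]+)-([0-9]+)-([0-9]+)\z");

    public static bool IsValid(string serial)
    {
        Match match = Shape.Match(serial);
        if (!match.Success) return false;

        // Capture group 2 is the seconds-since-midnight field.
        // TryParse also rejects digit runs too long to fit in an int.
        int seconds;
        if (!int.TryParse(match.Groups[2].Value, NumberStyles.None,
                          CultureInfo.InvariantCulture, out seconds))
        {
            return false;
        }

        return seconds >= 0 && seconds <= 86400;
    }
}
```

Keeping the range limit as a plain integer comparison means that changing it later is a one-number edit rather than a rewrite of the pattern.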

Upvotes: 7

jason

Reputation: 241711

Don't use regex? If you're struggling to come up with the regex to parse this, that suggests it's too complex and you should find something simpler. I see absolutely no benefit to using regex here when a simple

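// s is assumed to hold the seconds field, already extracted from the serial number.
int value;
if(!Int32.TryParse(s, out value)) {
    // Not a parseable integer at all.
    throw new ArgumentException();
}
if(value < 0 || value > 86400) {
    // Parsed, but outside the seconds-in-a-day range.
    throw new ArgumentOutOfRangeException();
}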

will work just fine. It's just so clear and easily maintainable.

Upvotes: 10
