Reputation: 42155
I need to validate serial numbers. For this we use regular expressions in C#, and for a certain product, part of the serial number is the "seconds since midnight". There are 86400 seconds in a day, but how can I validate it as a 5-digit number in this string?
654984051-86400-231324
I can't use this concept:
[0-8][0-6][0-4][0-0][0-0]
Because then 86399 wouldn't be valid. How can I overcome this? I want something like:
[00000-86400]
UPDATE
I want to make it clear that I'm aware of, and agree with, the "don't use regular expressions when there's a simpler way" school of thought. Jason's answer is exactly how I'd like to do it. However, this serial number validation is for all serial numbers that pass through our system, and there's currently no custom validation code for these specific ones. In this case I have a good reason for looking for a regex solution.
Of course, if there isn't one, then that makes the case for custom validation for these particular products undeniable, but I wanted to explore this avenue fully before going with a solution that requires code changes.
Upvotes: 6
Views: 5066
Reputation: 180868
The "Generate a Regular Expression to Match an Arbitrary Numeric Range" tool (http://utilitymill.com/utility/Regex_For_Range) yields the following regular expression:
\b0*([0-9]{1,4}|[1-7][0-9]{4}|8[0-5][0-9]{3}|86[0-3][0-9]{2}|86400)\b
Description of output:
First, break into equal length ranges:
0 - 9
10 - 99
100 - 999
1000 - 9999
10000 - 86400
Second, break into ranges that yield simple regexes:
0 - 9
10 - 99
100 - 999
1000 - 9999
10000 - 79999
80000 - 85999
86000 - 86399
86400 - 86400
Turn each range into a regex:
[0-9]
[1-9][0-9]
[1-9][0-9]{2}
[1-9][0-9]{3}
[1-7][0-9]{4}
8[0-5][0-9]{3}
86[0-3][0-9]{2}
86400
Collapse adjacent powers of 10:
[0-9]{1,4}
[1-7][0-9]{4}
8[0-5][0-9]{3}
86[0-3][0-9]{2}
86400
Combining the regexes above yields:
0*([0-9]{1,4}|[1-7][0-9]{4}|8[0-5][0-9]{3}|86[0-3][0-9]{2}|86400)
Tested here: http://osteele.com/tools/rework/
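For what it's worth, here is a minimal sketch of how the generated expression might be embedded in a whole-serial pattern. It assumes the layout of the example in the question (9 digits, a 5-digit seconds field, 6 digits); the class and method names are hypothetical. Because the generated expression tolerates any number of leading zeros and shorter numbers, a lookahead is added to pin the field to exactly five digits.

```csharp
using System.Text.RegularExpressions;

public static class SerialRegexSketch
{
    // Assumed layout (taken from the question's example): 9 digits, 5 digits, 6 digits.
    // (?=[0-9]{5}-) pins the middle field to exactly five digits; the alternation
    // is the generated range expression for 0-86400.
    // \z is used instead of $ so that a trailing newline is rejected.
    private static readonly Regex Serial = new Regex(
        @"^[0-9]{9}-(?=[0-9]{5}-)0*(?:[0-9]{1,4}|[1-7][0-9]{4}|8[0-5][0-9]{3}|86[0-3][0-9]{2}|86400)-[0-9]{6}\z");

    public static bool IsValid(string serial)
    {
        return serial != null && Serial.IsMatch(serial);
    }
}
```

With this sketch, `SerialRegexSketch.IsValid("654984051-86400-231324")` and the same serial with `86399` or `00000` in the middle are accepted, while `86401` or a one-digit field is rejected.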
Upvotes: 6
Reputation: 22250
I would use regex combined with some .NET code to accomplish this. A pure regex solution isn't going to be an easy or efficient way to handle large number ranges.
But this will:
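Regex myRegex = new Regex(@"^\d{9}-(\d{5})-\d{6}$");
Match match = myRegex.Match("654984051-86400-231324");
// Groups[1] holds the five-digit field; Success is false for malformed input.
String value = match.Success ? match.Groups[1].Value : null;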
This will grab the value 86400 in this case. And then you'd just check if the captured number is between 0 and 86400 as per Jason's answer.
Upvotes: 0
Reputation: 52390
If you really need a pure regex solution, I believe this would work, although the other posters make a good point about using the regex only to check that the fields are digits and then using a matching group to validate the actual number.
([0-7][0-9]{4})|(8[0-5][0-9]{3})|(86[0-3][0-9]{2})|(86400)
Upvotes: 0
Reputation: 31300
I don't believe this is possible in regular expressions, since this isn't something that can be checked as part of a regular language. In other words, a finite-state automaton cannot recognize this string, so a regular expression cannot either.
Edit: This can be recognized by a regex, but not in an elegant way. It would require a monster OR chain (e.g. 00000|00001|00002 or 0{1,5}|0{1,4}1|0{1,4}2). To me, having to enumerate such a large set of possibilities makes it clear that while it is technically possible, it is not feasible or manageable.
Upvotes: -1
Reputation: 91482
With the standard 'this-is-not-a-particularly-regexy-problem' caveat,
[0-7]\d{4}|8[0-5]\d{3}|86[0-3]\d{2}|86400
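Since the field only has 100000 possible values, an expression like this can be checked exhaustively against a plain integer comparison. The sketch below is one way to do that; the class and method names are hypothetical, the expression is anchored so the whole field must match, and `[0-9]` is swapped in for `\d` because .NET's `\d` also accepts non-ASCII digits by default.

```csharp
using System.Text.RegularExpressions;

public static class RangeRegexCheck
{
    // The alternation from the answer, anchored to the whole five-character field.
    private static readonly Regex Seconds = new Regex(
        @"^(?:[0-7][0-9]{4}|8[0-5][0-9]{3}|86[0-3][0-9]{2}|86400)\z");

    // Counts the five-digit strings (00000-99999) on which the regex and a
    // plain integer comparison disagree.
    public static int CountMismatches()
    {
        int mismatches = 0;
        for (int n = 0; n <= 99999; n++)
        {
            string field = n.ToString("D5"); // zero-padded to five digits
            bool expected = n <= 86400;
            if (Seconds.IsMatch(field) != expected)
            {
                mismatches++;
            }
        }
        return mismatches;
    }
}
```

If the expression is right, `RangeRegexCheck.CountMismatches()` returns 0. The same loop is a cheap way to sanity-check any hand-edited variant before it goes into a shared validation table.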
Upvotes: 6
Reputation: 993991
You don't want to try to use regular expressions for this; you'll end up with something incomprehensible, unwieldy, and difficult to modify (somebody will probably suggest one :). What you want to do is match the string using a regex to make sure that it contains digits in the format you want, then pull out a matching group and check the range using an arithmetic comparison. For example, in pseudocode:
match regex /(\d+)-(\d+)-(\d+)/
serial = capture group 2
if serial >= 0 and serial <= 86400 then
    // serial is valid
end if
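In C#, the pseudocode above might look roughly like the following sketch. The class and method names are hypothetical, and the pattern mirrors the pseudocode's loose `(\d+)-(\d+)-(\d+)` shape rather than any fixed field widths, so tighten the quantifiers if the real format is stricter.

```csharp
using System.Text.RegularExpressions;

public static class SerialValidator
{
    // Three hyphen-separated runs of ASCII digits, as in the pseudocode.
    private static readonly Regex Shape = new Regex(@"^([0-9]+)-([0-9]+)-([0-9]+)\z");

    public static bool IsValid(string serial)
    {
        if (serial == null)
        {
            return false;
        }

        Match m = Shape.Match(serial);
        if (!m.Success)
        {
            return false;
        }

        // Group 2 is the seconds-since-midnight field.
        int seconds;
        if (!int.TryParse(m.Groups[2].Value, out seconds))
        {
            return false; // e.g. too many digits to fit in an int
        }

        // The regex admits digits only, so the value cannot be negative.
        return seconds <= 86400;
    }
}
```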
Upvotes: 7
Reputation: 241711
Don't use regex? If you're struggling to come up with a regex to parse this, that suggests it's too complex and you should find something simpler. I see absolutely no benefit to using regex here when a simple
int value;
if (!Int32.TryParse(s, out value)) {
    throw new ArgumentException();
}
if (value < 0 || value > 86400) {
    throw new ArgumentOutOfRangeException();
}
will work just fine. It's just so clear and easily maintainable.
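One caveat worth knowing: the two-argument `Int32.TryParse` also accepts a leading sign and surrounding whitespace. If the field must be exactly five digits, as the question implies, a stricter variant could look like the sketch below. The class and method names are hypothetical, and the five-character length check is an assumption drawn from the question rather than from this answer.

```csharp
using System.Globalization;

public static class SecondsField
{
    public static bool IsValid(string s)
    {
        // Assumption: the field is always exactly five characters long.
        if (s == null || s.Length != 5)
        {
            return false;
        }

        int value;
        // NumberStyles.None permits digits only: no sign, whitespace, or separators.
        return int.TryParse(s, NumberStyles.None, CultureInfo.InvariantCulture, out value)
            && value <= 86400;
    }
}
```

This returns a bool instead of throwing, which may be easier to plug into a validation pipeline that already reports failures its own way.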
Upvotes: 10