TofuBug

Reputation: 603

Regex Full Date validation

(Note: This is not a question about what is the best way with code to do date validation. This is a question about learning more about regular expressions through some trial and error and other people's insight.)

I've been doing a lot of work with regular expressions lately, and quite frankly I suck at them. I'm learning a lot, though, and I'm seeking expert opinions on a particular regular expression.

Right now I'm working on migrating a fairly large project to .NET 4.0. It has a lot of parsing and data manipulation methods across many classes and namespaces. However, the majority, if not ALL, of the parsing and validation has been done with large, clunky for loops and a lot of IndexOf() calls.

I've been quite successfully using a combination of regular expressions, LINQ, and extension methods to greatly simplify and clarify the parsing and validation methods.

Trial and error and RegexBuddy have helped tremendously with the learning curve.

Now on to my actual question.

I was working on updating a simple date validation, though it is a very, VERY loose validation:

private static bool isLikeVCardDate(string value_Renamed)
{
  if (value_Renamed == null)
  {
    return true;
  }
  // Not really sure this is true but matches practice
  // Match YYYYMMDD
  if (isStringOfDigits(value_Renamed, 8))
  {
    return true;
  }
  // or YYYY-MM-DD
  return value_Renamed.Length == 10 && value_Renamed[4] == '-' && value_Renamed[7] == '-' && isSubstringOfDigits(value_Renamed, 0, 4) && isSubstringOfDigits(value_Renamed, 5, 2) && isSubstringOfDigits(value_Renamed, 8, 2);
}

If I want to match that functionality a simple RegEx of

private static bool isLikeVCardDate(string value_Renamed)
{
  return Regex.IsMatch(value_Renamed, @"\d{4}-?\d{2}-?\d{2}");
}

would meet the requirements
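
A hedged aside on that one-liner: Regex.IsMatch looks for a match anywhere in the input and throws on a null argument, so it is looser than the loop version in some respects and stricter in others. The following is only a minimal sketch of a closer equivalent. The class name VCardDateCheck is hypothetical, and the choice of [0-9] over \d assumes the original isStringOfDigits helper (not shown in the post) accepts ASCII digits only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VCardDateCheck
{
    // \A and \z anchor the pattern to the whole string. [0-9] is used because
    // \d in .NET also matches non-ASCII Unicode digits by default.
    // Either eight digits, or the fully hyphenated form, as in the loop version.
    private static readonly Regex LooseDate = new Regex(
        @"\A(?:[0-9]{8}|[0-9]{4}-[0-9]{2}-[0-9]{2})\z");

    public static bool IsLikeVCardDate(string value)
    {
        // The loop-based original treats null as acceptable, so keep that.
        return value == null || LooseDate.IsMatch(value);
    }

    public static void Main()
    {
        Console.WriteLine(IsLikeVCardDate("20120229"));    // True
        Console.WriteLine(IsLikeVCardDate("2012-02-29"));  // True
        Console.WriteLine(IsLikeVCardDate(null));          // True
        Console.WriteLine(IsLikeVCardDate("2012-0229"));   // False
        Console.WriteLine(IsLikeVCardDate("x20120229"));   // False
    }
}
```

Like the original, this checks only the shape of the string, so a value such as 9999-99-99 still passes.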

But it got me thinking: how would I go about validating that the date is a completely valid date, with leap years, days of the month, the whole nine yards?

I know there are other posts about date validation with regex. I'm not interested in someone outright giving me an answer, since I've got it working. I'm wondering if there is any knowledge anyone can impart on how to do it better or improve on it.

Mind you I know this is probably not the best example of a practical application of using a regex.

Here is the regex I came up with.

A few notes: I'm pasting it in a "tabbed" view just for simplicity of reading; the actual regex has no spaces or new lines.

Additionally, everything that is not a named capture group is a non-capturing group. (I left that out to save space, since I just want people's analysis of the regex.)

(
 (?<YEAR>((([0][48])|([13579][26])|([2468][048]))00)|(\d{2}(([0][48])|([13579][26])|([2468][048]))))
 -?
 (
  (
   (?<MONTH>(0[13578])|(1[02]))
   -?
   (?<DAY>(0[1-9])|([12][0-9])|(3[01]))
  )
  |
  (
   (?<MONTH>(0[469])|11)
   -?
   (?<DAY>(0[1-9])|([12][0-9])|30)
  )
  |
  (
   (?<MONTH>02)
   -?
   (?<DAY>(0[1-9])|([12][0-9]))
  )
 )
)
|
(
 (?<YEAR>\d{4})
 -?
 (
  (
   (?<MONTH>(0[13578])|(1[02]))
   -?
   (?<DAY>(0[1-9])|([12][0-9])|(3[01]))
  )
  |
  (
   (?<MONTH>(0[469])|11)
   -?
   (?<DAY>(0[1-9])|([12][0-9])|30)
  )
  |
  (
   (?<MONTH>02)
   -?
   (?<DAY>(0[1-9])|(1[0-9])|(2[0-8]))
  ) 
 )
)
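
As a hedged aside, .NET can consume a layout like this directly: RegexOptions.IgnorePatternWhitespace ignores unescaped whitespace in the pattern, and RegexOptions.ExplicitCapture turns every unnamed group into a non-capturing one. The sketch below assumes the class name TabbedDateRegex (hypothetical) and adds \A and \z anchors, which are not part of the pattern above, so that the whole input must be a date.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TabbedDateRegex
{
    // The pattern from the post in an indented layout. The outer \A( ... )\z
    // wrapper is an added assumption; everything inside is unchanged.
    private const string Pattern = @"
\A(
 (
  (?<YEAR>((([0][48])|([13579][26])|([2468][048]))00)|(\d{2}(([0][48])|([13579][26])|([2468][048]))))
  -?
  (
   ( (?<MONTH>(0[13578])|(1[02])) -? (?<DAY>(0[1-9])|([12][0-9])|(3[01])) )
   |
   ( (?<MONTH>(0[469])|11)        -? (?<DAY>(0[1-9])|([12][0-9])|30) )
   |
   ( (?<MONTH>02)                 -? (?<DAY>(0[1-9])|([12][0-9])) )
  )
 )
 |
 (
  (?<YEAR>\d{4})
  -?
  (
   ( (?<MONTH>(0[13578])|(1[02])) -? (?<DAY>(0[1-9])|([12][0-9])|(3[01])) )
   |
   ( (?<MONTH>(0[469])|11)        -? (?<DAY>(0[1-9])|([12][0-9])|30) )
   |
   ( (?<MONTH>02)                 -? (?<DAY>(0[1-9])|(1[0-9])|(2[0-8])) )
  )
 )
)\z";

    // IgnorePatternWhitespace lets the layout stay readable in source;
    // ExplicitCapture makes only the named groups capture.
    public static readonly Regex DateRegex = new Regex(
        Pattern,
        RegexOptions.IgnorePatternWhitespace | RegexOptions.ExplicitCapture);

    public static void Main()
    {
        Match m = DateRegex.Match("2012-02-29");
        Console.WriteLine(m.Success);                                   // True
        Console.WriteLine(m.Groups["YEAR"].Value + " "
            + m.Groups["MONTH"].Value + " " + m.Groups["DAY"].Value);   // 2012 02 29
        Console.WriteLine(DateRegex.IsMatch("1900-02-29"));             // False
        Console.WriteLine(DateRegex.IsMatch("20110431"));               // False
    }
}
```

Because each hyphen is independently optional, mixed forms such as 2012-0229 are still accepted, exactly as in the pattern above.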

Here is my thought process

  1. The number of days depends on the month: months 4, 6, 9, and 11 have 30 days; months 1, 3, 5, 7, 8, 10, and 12 have 31; and month 2 has 28 or 29.

  2. A year is a leap year if it is divisible by 4, unless it is also divisible by 100, in which case it must be divisible by 400 as well.

    1. This relies on the fact that ANY number is divisible by 4 if its last 2 digits, read as a number, are divisible by 4.

    2. Writing out the numbers from 4 to 96, I used the repeating pattern of 0(4,8), {even > 0}(0,4,8), and {odd}(2,6).

    3. Since testing for the 400-year leap years makes only the first 2 digits of the year relevant, we can use the same pattern from #2 above.

  3. Because of the leap year requirement the regex needs 2 separate captures for dates in a leap year and dates not in a leap year.
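
The leap-year reasoning in point 2 can be checked mechanically. The sketch below is an illustration rather than part of the original post: it isolates the leap-year portion of the pattern and compares it with DateTime.IsLeapYear for every year .NET supports. The class and member names (LeapYearPatternCheck, CountMismatches) are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LeapYearPatternCheck
{
    // Two-digit multiples of 4, excluding 00: 04 08 12 16 ... 96.
    private const string DivisibleBy4 = "(?:0[48]|[13579][26]|[2468][048])";

    // Leap year: a century divisible by 4 followed by 00 (divisible by 400),
    // or any two digits followed by a non-zero two-digit multiple of 4.
    public static readonly Regex LeapYear = new Regex(
        @"\A(?:" + DivisibleBy4 + @"00|[0-9]{2}" + DivisibleBy4 + @")\z");

    // Counts the years in 0001..9999 where the pattern and
    // DateTime.IsLeapYear disagree.
    public static int CountMismatches()
    {
        int mismatches = 0;
        for (int year = 1; year <= 9999; year++)
        {
            bool byRegex = LeapYear.IsMatch(year.ToString("D4"));
            if (byRegex != DateTime.IsLeapYear(year))
            {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void Main()
    {
        Console.WriteLine(CountMismatches());  // 0
    }
}
```

A mismatch count of zero supports the digit-pattern reasoning for years 0001 through 9999; year 0000 is outside what DateTime can represent and is not matched by the pattern either.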

Now, all of my assumptions could be wrong or just plain out there, but this is what I could come up with given how much I understand regex so far.

Upvotes: 1

Views: 6248

Answers (1)

Ergwun

Reputation: 12978

I understand that you are doing this as an exercise to learn about regular expressions, so you might enjoy working out how the examples in the answers to these other questions work:

Of course, one of the most important lessons to learn about regular expressions is when NOT to use them. As a result, I think you may struggle to get detailed feedback on the example you posted. The take-home lesson here is that while some people enjoy writing complex regular expressions, very few enjoy reading (or extending or fixing) them.
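
For comparison only, here is a minimal sketch of the non-regex route, assuming the two layouts from the question (YYYYMMDD and YYYY-MM-DD). The class and method names are hypothetical; DateTime.TryParseExact does the calendar checking.

```csharp
using System;
using System.Globalization;

public static class VCardDateParser
{
    // The two layouts accepted by the loop version in the question.
    private static readonly string[] Formats = { "yyyyMMdd", "yyyy-MM-dd" };

    public static bool IsValidVCardDate(string value)
    {
        // Declared up front to stay compatible with the C# version that
        // ships with .NET 4.0 (no inline out variables).
        DateTime ignored;
        return DateTime.TryParseExact(
            value,
            Formats,
            CultureInfo.InvariantCulture,
            DateTimeStyles.None,
            out ignored);
    }

    public static void Main()
    {
        Console.WriteLine(IsValidVCardDate("2012-02-29"));  // True
        Console.WriteLine(IsValidVCardDate("1900-02-29"));  // False
        Console.WriteLine(IsValidVCardDate("20110431"));    // False
    }
}
```

Unlike the loop version in the question, this sketch rejects null rather than accepting it.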

Upvotes: 2
