Tibi
Tibi

Reputation: 3875

Efficient algorithm for converting number of days to years (including leap years)

The problem

I am writing a class for holding dates in c++, and I found the following problem:

I have a number of days N since a reference date (in my case that would be Jan 1, 0001 AD), including the leap days that passed since the reference day. How can I convert this number to a year Y, month M and day D efficiently?

I would like to do this as efficiently as possible, so the best implementation would obviously have O(1) complexity.

The next sections will explain some of the things I already learned.

Leap years

To determine if a year is leap or not, there are a few rules:

  1. Years which are divisible by 4 are leap
  2. Exception to rule 1: years that are divisible with 100 are not leap
  3. Exception to rule 2: years that are divisible with 400 are leap

This would translate in code like this:

bool IsLeapYear(int year)
{
    // Corrected after Henrick's suggestion
    if (year % 400 == 0) return true;
    if ((year % 4 == 0) && (year % 100 != 0)) return true;
    return false;
}

An efficient method to calculate how many years are leap before an year would be:

int LeapDaysBefore(int year)
{
    // Years divisible by 4, not divisible by 100, but divisible by 400
    return ((year-1)/4 - (year-1)/100 + (year-1)/400);
}

Calculating the month

Once I find the year, I can calculate how many days there are until the current year, and I can subtract this number from N. This will give me the day of the year.

Keeping a table with the day number on which every month starts, we can easily calculate the month. I also created a function which will add 1 if the year is leap, and the month is greater or equal to 2.

// What day each month starts on (counting from 0)
int MonthDaySt[] = { 0, 31, 59, 90, 120, 151, 181, 212, 
    243, 273, 304, 334, 365 };

int MonthDayStart(int month, bool leap)
{
   if (leap && month >= 2) return MonthDaySt[month]+1;
   return MonthDaySt[month];
}

My idea

My algorithm is pretty complicated, and it looks like this:

void GetDate(int N, int &Y, int &M, int &D)
{
    int year_days;

    // Approximate the year, this will give an year greater or equal
    // to what year we are looking for.
    Y = N / 365 + 1;

    // Find the actual year, counting leap days
    do {
        Y--;

        // Calculate the actual number of days until the
        // approximate year
        year_days = Y * 365 + LeapDaysBefore(year);

    } while (year_days > N);

    // Add 1, because we start from year 1 AD (not 0)
    Y++;

    // Calculate month
    uint64_t diff = N - year_days; // Will give us the day of the year
    bool leap = IsLeapYear(Y);  // Is current year leap?

    // Use table to find month
    M = 0;
    while (MonthDayStart(M, leap) <= diff && M <= 12)
        M++;

    // Calculate day
    D = diff - MonthDayStart(M - 1, leap) + 1;
}

The function may have a few bugs (for example, it didn't work when N was 0).

Other notes

I hope that my algorithm is still correct, because I made some changes from the original for this question. If I missed something, or something was wrong, let me know to modify it. And sorry for the long question.

Upvotes: 14

Views: 15753

Answers (11)

ThibaultDECO
ThibaultDECO

Reputation: 63

This is how I do it:

class DatePoint {
public:
    double seconds;
    uint8_t minutes;
    uint8_t hours;
    uint8_t weekday;
    uint8_t day;
    uint8_t month;
    uint16_t year;
    DatePoint(double SecondsSinceEpoch);
    double GetSecondsSinceEpoch();
};

DatePoint::DatePoint(double SecondsSinceEpoch) {

    double DaysSinceEpoch = std::floor(SecondsSinceEpoch / 86400.0);
    double SecondsOfDay = SecondsSinceEpoch - (DaysSinceEpoch * 86400.0);
    double MinutesOfDay = std::floor(SecondsOfDay / 60.0);
    double HoursOfDay = std::floor(MinutesOfDay / 60.0);
    int DaysSinceEpoch_int = static_cast<int>(DaysSinceEpoch);
    DaysSinceEpoch_int = DaysSinceEpoch_int + (97 + (400 * 365)) * 5 - 11017; // 11,017 days between 01/01/1970 and 01/03/2000
    
    int Quotient = (DaysSinceEpoch_int / (97 + (400 * 365))) * 400;
    int Mod = DaysSinceEpoch_int - (97 + (400 * 365)) * (Quotient / 400);
    Quotient = Quotient + (Mod / (24 + (100 * 365)) - Mod / ((24 + (100 * 365)) * 4)) * 100;
    Mod = Mod - (24 + (100 * 365)) * (Mod / (24 + (100 * 365)) - Mod / ((24 + (100 * 365)) * 4));
    Quotient = Quotient + (Mod / (1 + (4 * 365))) * 4;
    Mod = Mod - (1 + (4 * 365)) * (Mod / (1 + (4 * 365)));
    Quotient = Quotient + Mod / 365 - Mod / (4 * 365);
    Mod = Mod - (Mod / 365 - Mod / (4 * 365)) * 365;
    Quotient = Quotient + Mod / 306;

    seconds = SecondsOfDay - (MinutesOfDay * 60.0);
    minutes = static_cast<uint8_t>(MinutesOfDay - (HoursOfDay * 60.0));
    hours = static_cast<uint8_t>(HoursOfDay);
    weekday = 1 + static_cast<uint8_t>((DaysSinceEpoch_int + 2) % 7);
    day = static_cast<uint8_t>(Mod - ((((100 * Mod + 52) / 3060) * 306 + 5) / 10) + 1);
    month = static_cast<uint8_t>(((((100 * Mod + 52) / 3060) + 2) % 12) + 1);
    year = static_cast<uint16_t>(Quotient);

};

double DatePoint::GetSecondsSinceEpoch() {
    double SecondsSinceEpoch = seconds + minutes * 60.0 + hours * 3600.0;
    int StdMonth = (static_cast<int>(month) + 9) % 12;
    int StdYear = static_cast<int>(year) - (StdMonth / 10);
    int Days = StdYear * 365 + (StdYear/4) - (StdYear/100) + (StdYear/400) + ((StdMonth*306 +5)/10) + static_cast<int>(day) - 1;
    Days = Days - (97 + (400 * 365)) * 5 + 11017; // 11,017 days between 01/01/1970 and 01/03/2000
    SecondsSinceEpoch = SecondsSinceEpoch + Days * 86400.0;
    return SecondsSinceEpoch;
};

Upvotes: 0

dembones
dembones

Reputation: 341

I made a number of failed attempts at solving Gregorian date problems over the years. I developed this code about 15 years ago, and it continues to perform well. Because I wrote versions of this code so long ago, it's in native C, but is easily compiled into C++ programs. Feel free to wrap these in a Date class, if you like.

My code is based on combining all the leap year rules into a 400-year cycle. Under Gregorian leap year rules, every 400-year cycle has exactly 146,097 days.

An optimization I employed is to move January and February to the end of the prior year. This makes the leap day (if present) always fall on the last day of the year. That allows me to build a table (dayOffset) which provides the distance in days from March 1. Because the leap day would fall at the end, this table is accurate for leap- and non-leap-years.

I'll begin with the header file.

#if !defined( TIMECODE_H_ )
#define TIMECODE_H_ 1

#if defined(__cplusplus)
extern "C" {
#endif

int dateCode( int month, int dayOfMonth, int year );

void decodeDate( int *monthPtr, int *dayOfMonthPtr, int *yearPtr, int dateCode );

int dayOfWeek( int dateCode );

int cardinalCode( int nth, int weekday, int month, int year );

enum Weekdays { eMonday, eTuesday, eWednesday, eThursday, eFriday, eSaturday, eSunday };

#if defined(__cplusplus)
}
#endif

#endif

The API consists of four methods: dateCode() calculates the date code for a Gregorian date. decodeDate() calculates the Gregorian month, day and year from a date code. dayOfWeek() returns the day of the week for a date code. cardinalCode() returns the date code for a "cardinal" day of a specific month (for example, the 2nd Wednesday of August 2014).

Here's the implementation:

#include <math.h>

enum
{
   nbrOfDaysPer400Years = 146097,
   nbrOfDaysPer100Years = 36524,
   nbrOfDaysPer4Years = 1461,
   nbrOfDaysPerYear = 365,
   unixEpochBeginsOnDay = 135080
};

const int dayOffset[] =
{
   0, 31, 61, 92, 122, 153, 184, 214, 245, 275, 306, 337, 366
};

/* ------------------------------------------------------------------------------------ */
int mod( int dividend, int divisor, int* quotientPtr )
{
   *quotientPtr = (int)floor( (double)dividend / divisor );
   return dividend - divisor * *quotientPtr;
}

/* ------------------------------------------------------------------------------------ */
int dateCode( int month, int dayOfMonth, int year )
{
   int days;
   int temp;
   int bYday;
   /*
   we take the approach of starting the year on March 1 so that leap days fall
   at the end. To do this we pretend Jan. - Feb. are part of the previous year.
   */
   int bYear = year - 1600;
   bYday = dayOffset[ mod( month - 3, 12, &temp ) ] + dayOfMonth - 1;
   bYear += temp;

   bYear = mod( bYear, 400, &days );
   days *= nbrOfDaysPer400Years;

   bYear = mod( bYear, 100, &temp );
   days += nbrOfDaysPer100Years * temp;

   bYear = mod( bYear, 4, &temp );
   days += nbrOfDaysPer4Years * temp + nbrOfDaysPerYear * bYear + bYday -
      unixEpochBeginsOnDay;

   return days;
}

/* ------------------------------------------------------------------------------------ */
int dayOfWeek( int dateCode )
{
   int temp;
   return mod( dateCode + 3, 7, &temp );
}

/* ------------------------------------------------------------------------------------ */
void decodeDate( int *monthPtr, int *dayOfMonthPtr, int *yearPtr, int dateCode )
{
   int diff;
   int diff2;
   int alpha;
   int beta;
   int gamma;
   int year;
   int temp;

   /* dateCode has the number of days relative to 1/1/1970, shift this back to 3/1/1600 */
   dateCode += unixEpochBeginsOnDay;
   dateCode = mod( dateCode, nbrOfDaysPer400Years, &temp );
   year = 400 * temp;
   dateCode = mod( dateCode, nbrOfDaysPer100Years, &temp );
   /* put the leap day at the end of 400-year cycle */
   if ( temp == 4 )
   {
      --temp;
      dateCode += nbrOfDaysPer100Years;
   }
   year += 100 * temp;
   dateCode = mod( dateCode, nbrOfDaysPer4Years, &temp );
   year += 4 * temp;
   dateCode = mod( dateCode, nbrOfDaysPerYear, &temp );
   /* put the leap day at the end of 4-year cycle */
   if ( temp == 4 )
   {
      --temp;
      dateCode += nbrOfDaysPerYear;
   }
   year += temp;

   /* find the month in the table */
   alpha = 0;
   beta = 11;
   gamma = 0;
   for(;;)
   {
      gamma = ( alpha + beta ) / 2;
      diff = dayOffset[ gamma ] - dateCode;
      if ( diff < 0 )
      {
         diff2 = dayOffset[ gamma + 1 ] - dateCode;
         if ( diff2 < 0 )
         {
            alpha = gamma + 1;
         }
         else if ( diff2 == 0 )
         {
            ++gamma;
            break;
         }
         else
         {
            break;
         }
      }
      else if ( diff == 0 )
      {
         break;
      }
      else
      {
         beta = gamma;
      }
   }

   if ( gamma >= 10 )
   {
      ++year;
   }
   *yearPtr = year + 1600;
   *monthPtr = ( ( gamma + 2 ) % 12 ) + 1;
   *dayOfMonthPtr = dateCode - dayOffset[ gamma ] + 1;
}

/* ------------------------------------------------------------------------------------ */
int cardinalCode( int nth, int weekday, int month, int year )
{
   int dow1st;
   int dc = dateCode( month, 1, year );
   dow1st = dayOfWeek( dc );
   if ( weekday < dow1st )
   {
      weekday += 7;
   }
   if ( nth < 0 || nth > 4 )
   {
      nth = 4;
   }
   dc += weekday - dow1st + 7 * nth;
   if ( nth == 4 )
   {
      /* check that the fifth week is actually in the same month */
      int tMonth, tDayOfMonth, tYear;
      decodeDate( &tMonth, &tDayOfMonth, &tYear, dc );
      if ( tMonth != month )
      {
         dc -= 7;
      }
   }
   return dc;
}

One issue with efficiency that will be immediately apparent is the mod() function. As you might expect, it provides the quotient and remainder of two integral dividends. C/C++ provides the modulus operator (%) which would seem to be a better choice; however, the standards don't specify how this operation should handle negative dividends. (See here for more information).

There is probably a portable solution which uses efficient integer math; however, I've opted here for one that is slightly less efficient, but guaranteed correct on all platforms.

A date code is simply an offset in days from a base date. I chose 1600-March-01 because it's the start of a 400-year Gregorian cycle that is early enough so that all the dates we are likely to encounter will result in a date code that is a positive integer. However, there's nothing incorrect about date codes before the base date. Since we're using a stable/portable modulo operation, all the math works well for negative date codes.

Some don't like my non-standard base date, so I decided to adopt the standard Unix epoch, which begins 1970-January-01. I defined unixEpochBeginsOnDay to bias the date code to start on the desired date. If you want to use a different base date, you would replace this value with one you prefer.

Calculating a date code is as simple as passing the month, dayOfMonth and year to dateCode():

int dc = dateCode( 2, 21, 2001 );  // returns date code for 2001-Feb-21

I've written dateCode so that it will accept values that are out of range for month and dayOfMonth. You can think of month as one plus the integer number of months after January of the given year. Here's a few tests to demonstrate:

assert(dateCode( 14, 1, 2000 ) == dateCode( 2, 1, 2001 ));
assert(dateCode( 5, 32, 2005 ) == dateCode( 6, 1, 2005 ));
assert(dateCode( 0,  1, 2014 ) == dateCode(12, 1, 2013 ));

Calling dateCode with non-canoncial month and dayOfMonth values, then converting back with decodeDate, is an effective way to canonicalize dates. For example:

int m, d, y;
decodeDate( &m, &d, &y, dateCode( 8, 20 + 90, 2014 ));
printf("90 days after 2014-08-20 is %4d-%02d-%02d\n", y, m, d);

The output should be:

90 days after 2014-08-20 is 2014-11-18

decodeDate() always produces canonical values for month and dayOfMonth.

dayOfWeek() simply returns the modulus 7 of the dateCode, but I had to bias dateCode by 3 since 1970-January-01 was Thursday. If you prefer to start your week on a different day than Monday, then fix the Weekdays enum and change the bias as desired.

cardinalCode() provides an interesting application of these methods. The first parameter provides the week number of the month ("nth"), and the second parameter provides the weekday. So to find the fourth Saturday in August 2007, you would:

int m, d, y;
decodeDate( &m, &d, &y, cardinalCode( 3, eSaturday, 8, 2007 ) );
printf( "%d/%02d/%d\n", m, d, y );

Which produces the answer:

8/25/2007

Note that the nth parameter, 3, in the example above specifies the fourth Saturday. I debated whether this parameter should be zero-based or one-based. For whatever reason, I settled on: 0=first, 1=second, 2=third, etc. Even the shortest months have four occurrences of every weekday. The value 4 has a special meaning. One would expect it to return the fifth occurrence of the requested weekday; however, since the month may or may not have a fifth occurrence, I decided to return the last occurrence of the month.

For example, to display the last Monday of each month next year:

int i, m, d, y;
for (i=1; i <= 12; ++i) {
    decodeDate( &m, &d, &y, cardinalCode( 4, eMonday, i, 2015 ) );
    printf( "%d/%02d/%d\n", m, d, y );
}

One final example, illustrating one use for cardinalCode(), displaying the number of days until the next general election:

#include <stdio.h>
#include <time.h> /* only needed for time() and localtime() calls */
#include "datecode.h"

void main()
{
   int eYear, eday, dc;
   int eY, eM, eD;
   time_t now;
   struct tm bdtm;

   time(&now);
   if (localtime_r(&now, &bdtm) == NULL) {
       printf("Error\n");
       return 1;
   }
   eYear = bdtm.tm_year + 1900;
   dc = dateCode(bdtm.tm_mon + 1, bdtm.tm_mday, eYear);
   if ((eYear % 2) != 0) {
       ++eYear;
   }
   for(;;) {
       eday = cardinalCode(0, eTuesday, 11, eYear);
       if (eday >= dc) break;
       eYear += 2;    /* move to the next election! */
   }
   decodeDate(&eM, &eD, &eY, eday);
   printf("Today is %d/%02d/%d\neday is %d/%02d/%d, %d days from today.\n",
           bdtm.tm_mon + 1, bdtm.tm_mday, bdtm.tm_year + 1900,
           eM, eD, eY, eday - dc);
}

Upvotes: 2

dviljoen
dviljoen

Reputation: 1632

bool IsLeapYear(int year)
{
    boost::gregorian::date d1(year, 1, 1);
    boost::gregorian::date d2 = d1 + boost::gregorian::years(1);
    boost::gregorian::date_duration diff = d2 - d1;
    return diff.days() != 365;
}

Upvotes: 0

Keldon Alleyne
Keldon Alleyne

Reputation: 2130

Here are a few pointers. Note: For this exercise I will assume that when N=0 that Y % 400 == 0.

1: There are a fixed number of days in each 400 year period (400 * 365) + 100 + 1 - 4.

The +100 is for the leap years, the +1 is for the leap year every 400 years and the -4 is for not having a leap year every 100 years.

So your first line of code will be:

GetDate(int N, int &Y, int &M, int &D) {
  const int DAYS_IN_400_YEARS = (400*365)+97;
  int year = (N / DAYS_IN_400_YEARS) * 400;
  N = N % DAYS_IN_400_YEARS;

2: You can make your life a great deal easier if you treat March 1st as the first day of the year

3: Adding to the code in (1), we can work out the year. Bear in mind that every fourth century begins with a leap year. So you can complete the calculation of the year with the following:

  const int DAYS_IN_100_YEARS = (100*365) + 24;
  year += 100 * (N / DAYS_IN_100_YEARS) + (N < DAYS_IN_100_YEARS ? 1 : 0); // Add an extra day for the first leap year that occurs every 400 years.
  N = N - (N < DAYS_IN_100_YEARS ? 1 : 0);
  N = N % DAYS_IN_400_YEARS;

4: Now you've sorted out the years, the rest is easy as pie (just remember (2), and the process is easy).

Alternatively you could use boost::date.

Upvotes: 5

Bill Door
Bill Door

Reputation: 18926

Why are you reinventing dates?

Date math is well understood. The standard C library (that's right, C, not just C++) has had date functions for many years.

As other's have indicated the boost date classes are also well designed and easy to use.

When searching for an answer, the first question should be, is the problem already solved. This problem has been solved for many years.

Upvotes: -4

DevSolar
DevSolar

Reputation: 70253

I have a number of days N since a reference date (in my case that would be Jan 1, 0001 AD)...

In that case, "efficiency" in applying the 4-100-400 rule and looking up month lengths is not your main problem. Please be also aware of the multiple problems inherent in applying today's Gregorian calendar to dates predating its inception, and the fact that Gregorian was not introduced uniformly. (*)

Wikipedia is a good starting point to a very involved subject.

(*): Depending on country, anywhere between 15 October 1582 and 15 February 1923, resp. not at all, actually.

Upvotes: 1

Nuno Aniceto
Nuno Aniceto

Reputation: 1982

Let me simplify the question, I won't consider exceptions for explanation. Every 4 years, a leap occur, if you have 365*5 days, there must be a leap-year (unless if the exception 2 is applied). You could just use division for having the number of leap-years (if ignoring the exceptions).

Then you can easily use division and remainder for having the non-leap years/months/days.

Use the same basic intuition for resolving the Exception 1 (if the number of years is a multiple of 100, then also check the Exception 2)

  1. Years which are divisible by 4 are leap
  2. Exception to rule 1: years that are divisible with 100 are not leap
  3. Exception to rule 2: years that are divisible with 400 are leap

Upvotes: 1

podiluska
podiluska

Reputation: 51494

To use the punchline of an old joke, "I wouldn't start from here".

You'll want to read up about various changes to calendaring before "modern" times, for example, what happened in 1752

Upvotes: 5

Timbo
Timbo

Reputation: 28050

To speed up the calculation of the year, you could build a lookup table

int[] YearStartDays =
{
    0,                     // 1 AD
    365,                   // 2 AD
    365 + 365,             // 3 AD
    365 + 365 + 365,       // 4 AD
    365 + 365 + 365 + 366, // 5 AD (4 was a leap year)
    /* ... */
};

You can then do a binary search in this array, which is O(log N) instead of the O(N) of your current year finding algorithm.

Upvotes: 0

SingerOfTheFall
SingerOfTheFall

Reputation: 29966

Obviously, the bottleneck is the year calculation. I would suggest you doing this. When you initialize the calendar, approximate the year (very rougly) by dividing the days by 365. After that, pre-form a list of all leap years before this estimation. It should be rather fast since you don't need to count all of them, just add 4 years each time. Also, while doing them, count how much of such you have. Actually, you could even count them in larger packs (i.e. there are 100 leap years every 400 years), but you will need to check for the the leap year exceptions carefully, not to skip some of them.

At the end of this, you will have the rough estimate of the year, and the amount of all leap years before it. Now you can count the precise year very easilly, without needing to iterate through anything:

leapYearCount * 366 + (lastCalculatedYear - leapYearCount) * 365

Upvotes: 1

Henrik
Henrik

Reputation: 23324

This

bool IsLeapYear(int year) 
{ 
    if ((year % 4 == 0) && (year % 100 != 0) && (year % 400 == 0)) return true; 
    else return false; 
}

is incorrect. It returns false for 2000. Better:

bool IsLeapYear(int year) 
{ 
    if (year % 400 == 0) return true; 
    if ((year % 4 == 0) && (year % 100 != 0)) return true; 
    return false; 
}

Upvotes: 1

Related Questions