user1911092

Reputation: 4241

Dealing with leap years and substituting a new date

I have a column of dates named DATE in a pandas DataFrame (df):

0       1998-03-31
1       1998-06-30
2       1998-09-30
3       1998-12-31
4       1999-03-31
5       1999-06-30
6       1999-09-30
7       1999-12-31
8       2000-02-29
9       2000-06-30
10      2000-09-30
11      2000-12-31
12      2001-03-31
13      2001-06-30
14      2001-09-30
Name: DATE, dtype: datetime64[ns]

I want to turn all of the leap year dates XXXX-02-29 into XXXX-02-28. What is the most efficient way to do this? Thank you.

Upvotes: 2

Views: 1112

Answers (4)

B. M.

Reputation: 18668

Since only leap years have a February 29, checking the month and day is enough:

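import pandas as pd

def _292(date): return (date.month == 2) and (date.day == 29)
# Use .loc rather than chained indexing so the assignment modifies df itself
df.loc[df['DATE'].apply(_292), 'DATE'] -= pd.Timedelta('1D')  # step back one day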
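
A vectorized variant of the same idea, as a minimal sketch: it builds the mask from the .dt accessors instead of calling a Python function per row. The small sample frame is made up for illustration; only the DATE column name comes from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's df
df = pd.DataFrame(
    {"DATE": pd.to_datetime(["1999-12-31", "2000-02-29", "2000-06-30"])}
)

# Boolean mask computed column-wise, no per-row Python calls
is_feb29 = (df["DATE"].dt.month == 2) & (df["DATE"].dt.day == 29)

# Move only the matching rows back by one day
df.loc[is_feb29, "DATE"] = df.loc[is_feb29, "DATE"] - pd.Timedelta(days=1)
```

On a large column this should avoid most of the overhead of apply, though I have not benchmarked it.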

Upvotes: 0

tmthydvnprt

Reputation: 10758

This can be done with apply and a lambda:

import pandas as pd

# Make DataFrame
df = pd.DataFrame(
    pd.date_range('1998-02-28', periods=12, freq='6M'),
    columns=['Date']
)
print('Original DataFrame:')
print(df)
print()

# Replace Feb 29 with Feb 28; every other date is returned unchanged
df['Date'] = df['Date'].apply(
    lambda x:
        x.replace(day=28) if x.month == 2 and x.day == 29
        else x
)

print('Processed DataFrame:')
print(df)
print()

Original DataFrame:
         Date
0  1998-02-28
1  1998-08-31
2  1999-02-28
3  1999-08-31
4  2000-02-29
5  2000-08-31
6  2001-02-28
7  2001-08-31
8  2002-02-28
9  2002-08-31
10 2003-02-28
11 2003-08-31

Processed DataFrame:
         Date
0  1998-02-28
1  1998-08-31
2  1999-02-28
3  1999-08-31
4  2000-02-28
5  2000-08-31
6  2001-02-28
7  2001-08-31
8  2002-02-28
9  2002-08-31
10 2003-02-28
11 2003-08-31

Upvotes: 1

jezrael

Reputation: 863246

You can use mask with map:

import pandas as pd
import datetime

def is_leap_and_29Feb(s):
    # Parentheses around the whole expression let it span several lines
    return ((s.dt.year % 4 == 0) &
            ((s.dt.year % 100 != 0) | (s.dt.year % 400 == 0)) &
            (s.dt.month == 2) & (s.dt.day == 29))

mask = is_leap_and_29Feb(df.DATE)
print(mask)
0     False
1     False
2     False
3     False
4     False
5     False
6     False
7     False
8      True
9     False
10    False
11    False
12    False
13    False
14    False
Name: DATE, dtype: bool
df['DATE'] = df.DATE.mask(mask,df.DATE.map(lambda x: datetime.datetime(x.year, x.month, 28)))
print(df)
        DATE
0  1998-03-31
1  1998-06-30
2  1998-09-30
3  1998-12-31
4  1999-03-31
5  1999-06-30
6  1999-09-30
7  1999-12-31
8  2000-02-28
9  2000-06-30
10 2000-09-30
11 2000-12-31
12 2001-03-31
13 2001-06-30
14 2001-09-30
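
If your pandas version provides the .dt.is_leap_year accessor, the modulo arithmetic can be dropped. A rough sketch under that assumption, using a made-up Series rather than the question's data:

```python
import pandas as pd

# Hypothetical sample Series for illustration
s = pd.Series(
    pd.to_datetime(["1999-12-31", "2000-02-29", "2004-02-28"]), name="DATE"
)

# is_leap_year replaces the % 4 / % 100 / % 400 checks
feb29 = s.dt.is_leap_year & (s.dt.month == 2) & (s.dt.day == 29)

# Where the mask is True, take the value shifted back one day
fixed = s.mask(feb29, s - pd.Timedelta(days=1))
```

Here the replacement values are computed by subtracting one day from the whole column, so no per-row map is needed.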

Upvotes: 0

DSchmidt

Reputation: 1127

You could check whether the year is a leap year and then check whether the date is 02-29.

is_leap = (year % 4 == 0 and year % 100 != 0) or year % 400 == 0
is_feb29 = is_leap and month == 2 and day == 29  # day/month check
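
The same check can be sketched with the standard library's calendar.isleap instead of the hand-written modulo test. The helper name clamp_feb29 and the sample dates are my own, purely for illustration:

```python
import calendar
from datetime import date

def clamp_feb29(d):
    """Return Feb 28 of the same year if d is Feb 29, otherwise d unchanged."""
    if calendar.isleap(d.year) and d.month == 2 and d.day == 29:
        return d.replace(day=28)
    return d

# Hypothetical sample dates
result = [clamp_feb29(d) for d in (date(2000, 2, 29), date(1999, 3, 31))]
```

It works on plain date and datetime objects, so it could also be passed to apply on the DATE column.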

Upvotes: 1
