Reputation: 83
I have a field with dates where the format is YYYYMMDD however some of the dates have a .0 following the date. See below sample data frame. I would like to determine if the dates in the Entry_Date field exists in the list of dates in holidays. If it does, I would like to have a df['Holiday'] field to say "Holiday" and if it is not in the list, then "Regular" The goal is just to flag dates as holiday versus not. I am running Python 3.7.
I am trying to convert the Entry_Date field to integer and then date type in one line of code if possible (more elegant?).
# Data and holidays
dates = {'Entry_Date': ['20171101.0', '20161201.0','20170103.0','20190101.0','20191225','20190522','20190318']}
holidays = ('2018-12-31','2019-01-01','2019-01-21','2019-02-18','2019-05-27','2019-07-04','2019-09-02','2019-10-14','2019-11-11','2019-11-28','2019-12-25')
df = pd.DataFrame(dates)
# Code to convert to date format. This does not seem to work.
df['date_norm'] = pd.to_numeric(df['Entry_Date'])
df['date_norm2'] = df.date_norm.astype(int)
df['date_norm3'] = pd.to_datetime(df['date_norm2'])
# Match
df['Holiday'] = df['Entry_Date'].apply(lambda x: 'Holiday' if x == holidays else 'Regular')
Thank you
Upvotes: 0
Views: 1188
Reputation: 3427
Two ways to mark Holiday
vs Regular
:
First,
df['Holiday'] = df['date_norm3'].apply(lambda x: 'Holiday' if x.strftime('%Y-%m-%d') in holidays else 'Regular')
Second,
df['Holiday'] = 'Regular'
df.loc[df['date_norm3'].dt.strftime('%Y-%m-%d').isin(holidays), 'Holiday'] = 'Holiday'
To convert to datetime, you just need 1 step:
df['date_norm3'] = pd.to_datetime(df['Entry_Date'].str[:10], format='%Y%m%d')
Upvotes: 1