Shmelky
Shmelky

Reputation: 83

In pandas, how do I flag a row in a dataframe if a date in one field matches any date within a list of dates?

I have a field with dates where the format is YYYYMMDD however some of the dates have a .0 following the date. See below sample data frame. I would like to determine if the dates in the Entry_Date field exists in the list of dates in holidays. If it does, I would like to have a df['Holiday'] field to say "Holiday" and if it is not in the list, then "Regular" The goal is just to flag dates as holiday versus not. I am running Python 3.7.

I am trying to convert the Entry_Date field to integer and then date type in one line of code if possible (more elegant?).

# Data and holidays
dates = {'Entry_Date': ['20171101.0', '20161201.0','20170103.0','20190101.0','20191225','20190522','20190318']}
holidays = ('2018-12-31','2019-01-01','2019-01-21','2019-02-18','2019-05-27','2019-07-04','2019-09-02','2019-10-14','2019-11-11','2019-11-28','2019-12-25')
df = pd.DataFrame(dates)

# Code to convert to date format. This does not seem to work.
df['date_norm'] = pd.to_numeric(df['Entry_Date'])
df['date_norm2'] = df.date_norm.astype(int)
df['date_norm3'] = pd.to_datetime(df['date_norm2']) 

# Match
df['Holiday'] = df['Entry_Date'].apply(lambda x: 'Holiday' if x == holidays else 'Regular')

Thank you

Upvotes: 0

Views: 1188

Answers (1)

Tim
Tim

Reputation: 3427

Two ways to mark Holiday vs Regular:

First,

df['Holiday'] = df['date_norm3'].apply(lambda x: 'Holiday' if x.strftime('%Y-%m-%d') in holidays else 'Regular')

Second,

df['Holiday'] = 'Regular'
df.loc[df['date_norm3'].dt.strftime('%Y-%m-%d').isin(holidays), 'Holiday'] = 'Holiday'

To convert to datetime, you just need 1 step:

df['date_norm3'] = pd.to_datetime(df['Entry_Date'].str[:10], format='%Y%m%d') 

Upvotes: 1

Related Questions