Reputation: 739
I have a Pandas dataframe with a DateTimeIndex and an empty column called HOLIDAY.
I want to set the value of that column to 'YES' if the datetime in the index is on a holiday, so that the resulting dataframe is like this:
TIME HOLIDAY
2019-11-25 06:00:00 NO
2019-11-26 21:00:00 NO
2019-11-27 18:00:00 NO
2019-11-28 08:00:00 YES
2019-11-29 08:00:00 NO
2019-11-30 08:00:00 NO
I have a list of dates:
holidays = ['2019-07-04', '2019-11-28','2019-12-25']
holidays = pd.to_datetime(holidays)
I tried this, but I get an error:
df.loc[df.index.date.isin(holidays), 'HOLIDAY'] = "YES"
What's the best way to achieve this?
Thank you
Upvotes: 0
Views: 1075
Reputation: 30971
Note that:
holidays
contains a.o. 2019-11-28
at midnight,2019-11-28
, but at 8:00.If you want to find rows with index values in holidays
dates (regardless
of the time part), you have to "nullify" the time part.
One of methods to get the rows in question is to use boolean indexing:
df[df.index.floor('D').isin(holidays)]
The result is:
HOLIDAY
TIME
2019-11-28 08:00:00 YES
You can also get only HOLIDAY
column, running:
df[df.index.floor('D').isin(holidays)].HOLIDAY
This time the result (Series) is:
TIME
2019-11-28 08:00:00 YES
Name: HOLIDAY, dtype: object
Upvotes: 1
Reputation: 153460
DateTimeIndex.date returns a numpy array of strings not pd.Series of pd.Timestamp dtype. So you must get the same dtype on both sides of the equality:
If TIME is not in your index this will work:
m2 = df['TIME'].dt.date.isin(holidays.date)
or
m2 = df.index.to_series().dt.date.isin(holidays.date)
df.loc[m2, 'HOLIDAY'] = "YES"
Output:
HOLIDAY
TIME
2019-11-25 06:00:00 NO
2019-11-26 21:00:00 NO
2019-11-27 18:00:00 NO
2019-11-28 08:00:00 YES
2019-11-29 08:00:00 NO
2019-11-30 08:00:00 NO
Upvotes: 1