Reputation: 253
I have a dataframe with a double index (day, time) and would like to create a new column 'Holiday' equal to one if the index day belongs to a list of holidays.
My list of holidays of type DatetimeIndex:
holidays = ['2017-09-11', '2017-12-24']
My original dataframe:
Visitor
Date Time
2017-09-11 4:45 0
5:00 1
5:15 26
....
2017-09-12 4:45 0
5:00 1
5:15 26
....
What I would like to have:
Visitor Holiday
Date Time
2017-09-11 4:45 0 1
5:00 1 1
5:15 26 1
....
2017-09-12 4:45 0 0
5:00 1 0
5:15 26 0
....
Here is what I tried based on this previous answer:
df['Holiday'] = int(df.index.get_level_values(0) in holidays == True)
However my column 'Holiday' always has the value 0...
Thanks in advance!
Upvotes: 0
Views: 3925
Reputation: 402982
Your current solution should actually throw a ValueError
:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Note that the in
operator is meant for scalar values and is not meant to be used with pandas dataframes. pandas instead has a lot of conditional and bitwise operators that are overloaded, as well as a comprehensive suite of APIs for the purpose of conditional and boolean logic.
You can use np.where
and isin
.
df['Holiday'] = np.where(df.index.get_level_values(0).isin(holidays), 1, 0)
df
Visitor Holiday
Date Time
2017-09-11 4:45 0 1
5:00 1 1
5:15 26 1
2017-09-12 4:45 0 0
5:00 1 0
5:15 26 0
Upvotes: 2
Reputation: 77017
Use isin
by taking the date level from get_level_values
and use astype(int)
to convert boolean to integer.
In [192]: df['Holiday'] = df.index.get_level_values(0).isin(holidays).astype(int)
In [193]: df
Out[193]:
Visitor Holiday
Date Time
2017-09-11 4:45 0 1
5:00 1 1
5:15 26 1
2017-09-12 4:45 0 0
5:00 1 0
5:15 26 0
If you want a copy instead of modifying df
In [196]: df.assign(Holiday=df.index.get_level_values(0).isin(holidays).astype(int))
Out[196]:
Visitor Holiday
Date Time
2017-09-11 4:45 0 1
5:00 1 1
5:15 26 1
2017-09-12 4:45 0 0
5:00 1 0
5:15 26 0
Upvotes: 2