I have a dataframe df that contains a column of dates in a string format like '2011-12-13' and a column of times, again in a string format, like '15:40:00'.
df
index                date        time
2011-01-03 09:40:00  2011-01-03  09:40:00
2011-01-03 09:45:00  2011-01-03  09:45:00
2011-01-03 09:50:00  2011-01-03  09:50:00
2011-01-03 09:55:00  2011-01-03  09:55:00
2011-01-03 10:00:00  2011-01-03  10:00:00
2011-01-03 10:05:00  2011-01-03  10:05:00
My objective is to create a column F0 in my dataframe where F0=1 if the date is one of these dates ('2011-01-26','2011-03-15', '2011-08-09', '2011-09-21', '2011-12-13') and the time is '9:40:00'.
I am trying to use the numpy function where as follows:
dates = ['2011-01-26','2011-03-15', '2011-08-09', '2011-09-21', '2011-12-13']
df['F1'] = np.where((df.date == any(dates) & (df.time== '9:40:00'), 1, 0))
I get this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Why? I don't know how to use the any function correctly.
I want to create multiple columns F2, F3, and so on for other times, like:
df['F77'] = np.where((df.date == any(dates) & (df.time== '16:00:00'), 1, 0))
Reputation: 251383
You don't need to use where. Just use isin and apply your condition directly to the columns:
df['F1'] = df.date.isin(dates) & (df.time=='09:40:00')
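To spell this out a bit, here is a minimal sketch on made-up sample data (the rows and the `times` mapping are assumptions for illustration, not taken from the question). The built-in any(dates) just collapses the list to a single True, so it never tests whether each row's date is in the list; isin does that row by row. Note also that the comparison is on strings, so the time has to match the stored format exactly ('09:40:00', zero-padded, rather than '9:40:00'). The line above yields True/False; if you want 1/0 as in the question, .astype(int) is one way, and np.where(mask, 1, 0) with three separate arguments is another.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data in the same string format as the question.
df = pd.DataFrame({
    "date": ["2011-01-26", "2011-01-26", "2011-01-03", "2011-12-13"],
    "time": ["09:40:00", "16:00:00", "09:40:00", "09:40:00"],
})

dates = ['2011-01-26', '2011-03-15', '2011-08-09', '2011-09-21', '2011-12-13']

# The built-in any() reduces the list to one bool; it is not a membership test.
collapsed = any(dates)  # a single True, because the list holds non-empty strings

# isin() checks each row; & combines the two boolean Series elementwise.
mask = df["date"].isin(dates) & (df["time"] == "09:40:00")
df["F1"] = mask.astype(int)

# Equivalent result with np.where, passing condition, 1 and 0 as separate arguments.
f1_where = np.where(mask, 1, 0)

# One flag column per time of interest (column names here are illustrative).
times = {"F1": "09:40:00", "F77": "16:00:00"}
for col, t in times.items():
    df[col] = (df["date"].isin(dates) & (df["time"] == t)).astype(int)

print(df)
```

The loop at the end is only one possible way to build F2, F3, and so on; a dict keeps each column name next to the time it flags.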