Plug4

Reputation: 3938

Python: numpy where command with if statement

I have a dataframe df that contains a column of dates stored as strings like '2011-12-13' and a column of times, also stored as strings, like '15:40:00'.

df

index                 date        time
2011-01-03 09:40:00   2011-01-03  09:40:00 
2011-01-03 09:45:00   2011-01-03  09:45:00 
2011-01-03 09:50:00   2011-01-03  09:50:00  
2011-01-03 09:55:00   2011-01-03  09:55:00 
2011-01-03 10:00:00   2011-01-03  10:00:00  
2011-01-03 10:05:00   2011-01-03  10:05:00  

My objective is to create a column F1 in my dataframe where F1=1 if the date is one of these dates ('2011-01-26', '2011-03-15', '2011-08-09', '2011-09-21', '2011-12-13') and the time is '9:40:00', and F1=0 otherwise.

I am trying to use the numpy function where as follows:

dates = ['2011-01-26','2011-03-15', '2011-08-09', '2011-09-21', '2011-12-13']

df['F1'] = np.where((df.date == any(dates) & (df.time== '9:40:00'), 1, 0))

I get this error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). Why? I don't know how to use the any function correctly.

I want to create multiple columns F2, F3, and so on for other time intervals, like:

df['F77'] = np.where((df.date == any(dates) & (df.time== '16:00:00'), 1, 0))

Upvotes: 3

Views: 520

Answers (1)

BrenBarn

Reputation: 251383

You don't need to use where. Just use isin and apply your condition directly to the columns:

df['F1'] = df.date.isin(dates) & (df.time=='09:40:00')
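As for why the original attempt fails: Python's built-in any(dates) collapses the whole list into a single True, so it never compares each row's date against the list. isin does that comparison row by row. Here is a minimal sketch of the idea on made-up rows; the sample data and the flag_times mapping are assumptions for illustration, not part of the question. Note that the result of isin combined with & is boolean, so astype(int) is used to get 1/0, and that the time strings in the frame are zero-padded ('09:40:00', not '9:40:00').

```python
import pandas as pd

# Toy frame (made-up rows) with string date/time columns, as in the question.
df = pd.DataFrame({
    "date": ["2011-01-03", "2011-01-26", "2011-01-26", "2011-12-13"],
    "time": ["09:40:00", "09:40:00", "16:00:00", "09:40:00"],
})

dates = ["2011-01-26", "2011-03-15", "2011-08-09", "2011-09-21", "2011-12-13"]

# isin checks each row's date against the list; this mask is reused below.
in_dates = df["date"].isin(dates)

# Hypothetical mapping from flag column name to the time it should match.
flag_times = {"F1": "09:40:00", "F77": "16:00:00"}

for col, t in flag_times.items():
    # & combines the two boolean masks row by row; astype(int) gives 1/0.
    df[col] = (in_dates & (df["time"] == t)).astype(int)

print(df)
```

Computing the isin mask once and reusing it avoids repeating the date lookup for every flag column.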

Upvotes: 4
