Reputation: 719
I have a dataframe like this,
ID 00:00 01:00 02:00 ... 23:00 avg_value
22 4.7 5.3 6 ... 8 5.5
37 0 9.2 4.5 ... 11.2 9.2
4469 2 9.8 11 ... 2 6.4
Can I use np.where
to apply conditions on multiple columns at once?
I want to update the values from 00:00 to 23:00 to 0
and 1
. If the value at the time of day is greater than avg_value
then I change it to 1
, else to 0
.
I know how to apply this method to one single column.
np.where(df['00:00']>df['avg_value'],1,0)
Can I change it to multiple columns?
Output will be like,
ID 00:00 01:00 02:00 ... 23:00 avg_value
22 0 1 1 ... 1 5.5
37 0 0 0 ... 1 9.2
4469 0 1 1 ... 0 6.4
Upvotes: 1
Views: 101
Reputation: 862661
Select all columns without last by DataFrame.iloc
, compare by DataFrame.gt
and casting to integer
s and last add avg_value
column by DataFrame.join
:
df = df.iloc[:, :-1].gt(df['avg_value'], axis=0).astype(int).join(df['avg_value'])
print (df)
00:00 01:00 02:00 23:00 avg_value
ID
22 0 0 1 1 5.5
37 0 0 0 1 9.2
4469 0 1 1 0 6.4
Or use DataFrame.pop
for extract column:
s = df.pop('avg_value')
df = df.gt(s, axis=0).astype(int).join(s)
print (df)
00:00 01:00 02:00 23:00 avg_value
ID
22 0 0 1 1 5.5
37 0 0 0 1 9.2
4469 0 1 1 0 6.4
Because if assign to same columns integers are converted to floats (it is bug):
df.iloc[:, :-1] = df.iloc[:, :-1].gt(df['avg_value'], axis=0).astype(int)
print (df)
00:00 01:00 02:00 23:00 avg_value
ID
22 0.0 0.0 1.0 1.0 5.5
37 0.0 0.0 0.0 1.0 9.2
4469 0.0 1.0 1.0 0.0 6.4
Upvotes: 2