Reputation: 163
I have the following data frame. For each date and hour, I want to create a new column "result" such that if the value in column "B" is >= 0, the value in column "A" is used; otherwise, the maximum of 0 and the previous row's value in column "B" is used.
Date Hour A B result
1/1/2018 1 5 95 5
1/1/2018 1 16 79 16
1/1/2018 1 85 -6 79
1/1/2018 1 12 -18 0
1/1/2018 2 17 43 17
1/1/2018 2 17 26 17
1/1/2018 2 16 10 16
1/1/2018 2 142 -132 10
1/1/2018 2 10 -142 0
I tried grouping by date and hour and then applying a lambda function using shift, but I got an error:
df['result'] = df.groupby(['Date','Hour']).apply(lambda x: x['A'] if x['B'] >= 0 else np.maximum(0, x['B'].shift(1)), axis = 1)
Upvotes: 3
Views: 1232
Reputation: 402553
Use np.where. The groupby is only necessary when shifting "B", so you can vectorise this operation without using apply.
df['result'] = np.where(
    df.B >= 0,
    df.A,
    df.groupby(['Date', 'Hour'])['B'].shift().clip(lower=0))
df
Date Hour A B result
0 1/1/2018 1 5 95 5.0
1 1/1/2018 1 16 79 16.0
2 1/1/2018 1 85 -6 79.0
3 1/1/2018 1 12 -18 0.0
4 1/1/2018 2 17 43 17.0
5 1/1/2018 2 17 26 17.0
6 1/1/2018 2 16 10 16.0
7 1/1/2018 2 142 -132 10.0
8 1/1/2018 2 10 -142 0.0
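For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the sample frame from the question and applies the same np.where approach. The intermediate name prev_b is only an illustrative choice, not something from the original posts.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date": ["1/1/2018"] * 9,
    "Hour": [1, 1, 1, 1, 2, 2, 2, 2, 2],
    "A": [5, 16, 85, 12, 17, 17, 16, 142, 10],
    "B": [95, 79, -6, -18, 43, 26, 10, -132, -142],
})

# Previous row's B within each (Date, Hour) group, floored at 0.
# The first row of each group has no predecessor, so it is NaN here.
prev_b = df.groupby(["Date", "Hour"])["B"].shift().clip(lower=0)

# Take A where B is non-negative, otherwise the floored previous B.
df["result"] = np.where(df["B"] >= 0, df["A"], prev_b)
print(df)
```

The result column comes out as float because the shifted series contains NaN for the first row of each group. In this sample every group starts with a non-negative B, so no NaN reaches the result. If a group could start with a negative B, that row would be NaN, and whether to fill it with 0 is a decision the question does not settle.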
Upvotes: 5