Reputation: 1309
fruit year value
apple 1950 2
apple 1951 3
apple 1952 3
apple 1953 4
apple 1954 5
apple 1955 4
banana 1950 333
banana 1951 335
How do i add a boolean if the value is lower than the value for the row above, per fruit? And if so the average of the previous 5 years
Expected Output
fruit year value lower_than_before 5 year avg
apple 1950 2
apple 1951 3
apple 1952 3
apple 1953 4
apple 1954 5
apple 1955 4 True 3.4
banana 1950 333
banana 1951 335
Upvotes: 2
Views: 76
Reputation: 153460
Try this:
g = df.groupby('fruit')
df['lower_than_before'] = g['value'].diff().lt(0)
df['5 year avg'] = g['value'].apply(lambda x: x.rolling(5).mean().shift().where(df['lower_than_before'].cummax()))
df
Output:
fruit year value lower_than_before 5 year avg
0 apple 1950 2 False NaN
1 apple 1951 3 False NaN
2 apple 1952 3 False NaN
3 apple 1953 4 False NaN
4 apple 1954 5 False NaN
5 apple 1955 4 True 3.4
6 banana 1950 333 False NaN
7 banana 1951 335 False NaN
Upvotes: 2
Reputation: 142
Use shift function and handle the NaN, something like this:
df['lower_than_before'] = np.where((df.value - df.value.shift(1)) < 0, True, '')
df['5 year avg'] = np.where(df.lower_than_before=='True', df.value.rolling(window=5, min_periods=0).mean().shift(1), '')
Output:
fruit year value lower_than_before 5 year avg
0 apple 1950 2
1 apple 1951 3
2 apple 1952 3
3 apple 1953 4
4 apple 1954 5
5 apple 1955 4 True 3.4
6 banana 1950 333
7 banana 1951 335
Upvotes: 1