Reputation: 8206
I have the following pandas dataframe:
a
1.0
1.5
1.3
1.2
1.9
0.8
I want to apply a custom function to this column. The function takes a window
parameter, meaning it should only process n items at a time:
import numpy as np

def hislack(x, window):
    # I only want to work with the last n items
    x = x[-window:]
    # and do some stuff (this is a nonsense example, just a simple sum)
    r = np.sum(x)
    return r
To store the result of this function in a new column called b, I used this:
df['b'] = hislack(df['a'].values, 3)
But it returns the following:
a b
1.0 3.9
1.5 3.9
1.3 3.9
1.2 3.9
1.9 3.9
0.8 3.9
This is the result for the last row only (1.2 + 1.9 + 0.8 = 3.9), repeated in every row.
So the expected output would be:
a b
1.0 NaN
1.5 NaN
1.3 3.8
1.2 4.0
1.9 4.4
0.8 3.9
How can I avoid getting the same result in every row, so that the function is applied to each window separately?
Upvotes: 1
Views: 1861
Reputation: 215117
You need rolling (Series.rolling here, since df['a'] is a Series; DataFrame.rolling works the same way):
df['a'].rolling(3).sum() # here 3 is the window parameter for your function and sum
# is the function/operation you want to apply to each window
#0 NaN
#1 NaN
#2 3.8
#3 4.0
#4 4.4
#5 3.9
#Name: a, dtype: float64
Or:
df['a'].rolling(3).apply(sum)
More generally, you can do df['a'].rolling(window).apply(fun), where you pass the window parameter to rolling and the function to apply.
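To make the general form concrete, here is a minimal sketch using the data from the question. It assumes the custom function is rewritten to receive one window at a time (so it no longer needs its own window argument), and it passes raw=True so each window arrives as a NumPy array; both choices are illustrative rather than required.

```python
import numpy as np
import pandas as pd

# Data from the question
df = pd.DataFrame({"a": [1.0, 1.5, 1.3, 1.2, 1.9, 0.8]})

def hislack(x):
    # x holds the values of a single window
    # (a NumPy array, because raw=True is passed below)
    return np.sum(x)

window = 3
# rolling() does the slicing; apply() calls hislack once per full window
df["b"] = df["a"].rolling(window).apply(hislack, raw=True)
print(df)
```

The first two rows of b are NaN because, for an integer window, rolling only produces a value once the window contains `window` observations.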
Upvotes: 3