Buzi

Reputation: 248

Is there a way in Pandas to use previous row values in dataframe.apply where previous values are also calculated in the apply?

I have the following dataframe:

      W    Y
 0    1    5
 1    2    NaN
 2    3    NaN
 3    4    NaN
 4    5    NaN
 5    6    NaN
 6    7    NaN
 ...

The table continues in the same way up to index 240. I want to get the following dataframe:

      W    Y
 0    1    5
 1    2    7
 2    3    10
 3    4    14
 4    5    19
 5    6    27
 6    7    37
 ...

Please note that the values of W are arbitrary (chosen to make the computation here easier; in my real program they come from np.random.normal).
In other words:
If the index is 0, then Y is 5.
If the index is between 1 and 4 (inclusive), then Y_i is the sum of the previous element of Y and the current element of W: Y_i = Y_{i-1} + W_i.
If the index is >= 5, then Y_i = Y_{i-1} + Y_{i-4} - Y_{i-5} + W_i.
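
As a quick check of the third rule against the expected table above, the first two values it produces can be worked out by hand, using W_5 = 6 and W_6 = 7 from the sample data:

```latex
\begin{aligned}
Y_5 &= Y_4 + Y_1 - Y_0 + W_5 = 19 + 7 - 5 + 6 = 27,\\
Y_6 &= Y_5 + Y_2 - Y_1 + W_6 = 27 + 10 - 7 + 7 = 37.
\end{aligned}
```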

Using iipr's answer, I've managed to compute the first five values by running:

def calculate(add):
    global value
    value = value + add
    return value

df.Y = np.nan
value = 5
df.loc[0, 'Y'] = value
# .loc slices include the end label, so 1:4 covers indices 1 through 4
df.loc[1:4, 'Y'] = df.loc[1:4].apply(lambda row: calculate(*row[['W']]), axis=1)

but I haven't managed to compute the rest of the values (where index >= 5).
Does anyone have any suggestions?

Upvotes: 2

Views: 499

Answers (1)

SpghttCd

Reputation: 10860

I wouldn't recommend using apply in this case.
Why not simply use two loops, one for each differently defined range:

for i in df.index[1:5]:
    df.loc[i, 'Y'] = df.W.loc[i] + df.Y.loc[i-1]
for i in df.index[5:]:
    df.loc[i, 'Y'] = df.W.loc[i] + df.Y.loc[i-1] + df.Y.loc[i-4] - df.Y.loc[i-5]

This is straightforward, and you will still know next week what the code does.
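
For a self-contained version of the same idea, here is a minimal sketch that runs the recurrence on a plain NumPy array and assigns the result back in one step. The helper name fill_y is hypothetical, not a pandas function. It also assumes the default 0..n-1 index, because it works on positions rather than labels.

```python
import numpy as np
import pandas as pd


def fill_y(w, y0):
    """Return Y for the recurrence in the question (hypothetical helper).

    Y[0] = y0
    Y[i] = Y[i-1] + W[i]                      for 1 <= i <= 4
    Y[i] = Y[i-1] + Y[i-4] - Y[i-5] + W[i]    for i >= 5
    """
    w = np.asarray(w, dtype=float)
    y = np.empty(len(w), dtype=float)
    if len(w) == 0:
        return y
    y[0] = y0
    for i in range(1, len(w)):
        # Both rules share the Y[i-1] + W[i] part.
        y[i] = y[i - 1] + w[i]
        if i >= 5:
            # The extra terms only apply from index 5 onwards.
            y[i] += y[i - 4] - y[i - 5]
    return y


# Sample data from the question: W = 1..7, Y starts at 5.
df = pd.DataFrame({"W": np.arange(1, 8)})
df["Y"] = fill_y(df["W"].to_numpy(), y0=5)
print(df)  # Y is 5, 7, 10, 14, 19, 27, 37 (stored as floats)
```

Looping over an array avoids a df.loc lookup on every step, which should matter little for 241 rows but helps on longer frames.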

Upvotes: 3
