Hasan A Yousef
Hasan A Yousef

Reputation: 24948

How to add lagging data to the data frame

Let's say I've the below data frame:

df = pd.DataFrame(
    np.random.randn(10, 3),
    index=pd.date_range("1/1/2022", freq="D", periods=10),
    columns=["A", "B", "C"],
    )

df

enter image description here

I want to add column D, so that we check the value of B at each date, if it is positive then D equals B of the previous day, if it is negative, then it is max of A and C of the previous day.

For example:

  1. At 2022-01-05, B is positive, then D should be equal to the value of B at day 2022-01-04 that is -1.329675
  2. At 2022-01-03, B is negative, then D should be equal to the max of A and C at day 2022-01-02 that is 0.562377

Upvotes: 0

Views: 43

Answers (1)

Ynjxsjmh
Ynjxsjmh

Reputation: 30032

You can try shift with np.where

df['D'] = np.where(df['B'] > 0,
                   df['B'].shift(),
                   df[['A', 'C']].shift().max(axis=1))
print(df)

                   A         B         C         D
2022-01-01  1.495712  0.268461 -0.954359       NaN
2022-01-02 -0.585978 -0.214993  1.110485  1.495712
2022-01-03 -0.038061 -0.779959 -0.445906  1.110485
2022-01-04 -1.145300  0.436302 -0.480348 -0.779959
2022-01-05 -0.297486 -0.013070 -0.416821 -0.480348
2022-01-06  0.244772  0.288131  0.439024 -0.013070
2022-01-07  2.887469  0.608421 -0.432701  0.288131
2022-01-08  0.414783 -0.031525 -2.629204  2.887469
2022-01-09  0.786881  0.505138 -0.817680 -0.031525
2022-01-10 -1.097626  0.750188  1.216392  0.505138

Upvotes: 2

Related Questions