Reputation: 129
I have a data frame which looks like below:
Sales Look_back_period
0 100 1
1 102 1
2 103 2
3 108 2
4 100 3
5 90 4
6 97 6
The output I want:
Sales Look_back_period lag_sales
0 100 1 NaN
1 102 1 100
2 103 2 100
3 108 2 102
4 100 3 102
5 90 4 102
6 97 6 100
I want to shift the sales values based on Look_back_period
.
What I am doing is:
df[‘lag_sales’] = df.Sales.shift(df.look_back_period)
but it is not working.
Upvotes: 3
Views: 387
Reputation: 394041
Actually you can just reindex
by subtracting the look_back_period values from the index:
In[18]:
df['lag_sales'] = df['Sales'].reindex(df.index - df['Look_back_period']).values
df
Out[18]:
Sales Look_back_period lag_sales
0 100 NaN NaN
1 102 100.0 100.0
2 103 100.0 100.0
3 108 102.0 102.0
4 100 102.0 102.0
5 90 102.0 102.0
6 97 100.0 100.0
We have to convert to a numpy array using .values
because without this you will get a ValueError: cannot reindex from a duplicate axis
error
Upvotes: 6
Reputation: 164673
Here's one solution using a list comprehension with enumerate
:
df = pd.DataFrame.from_dict({'Look_back_period': {0: 1, 1: 1, 2: 2, 3: 2, 4: 3, 5: 4, 6: 6},
'Sales': {0: 100, 1: 102, 2: 103, 3: 108, 4: 100, 5: 90, 6: 97}})
df['lage_sales'] = [np.nan] + [df['Sales'].iat[i-j] for i, j in \
enumerate(df['Look_back_period'].iloc[1:], 1)]
print(df)
Sales Look_back_period lage_sales
0 100 1 NaN
1 102 1 100.0
2 103 2 100.0
3 108 2 102.0
4 100 3 102.0
5 90 4 102.0
6 97 6 100.0
Upvotes: 2