Reputation: 1945
Good Evening,
I have the following dataframe:
print(dd)
dt_op quantity product_code
20/01/18 1 613
21/01/18 8 611
21/01/18 1 613
...
I am trying to compute lagged sales, but the following code sums over all products instead of computing the value per product_code:
dd["Lagged_Sales"] = [dd.loc[dd['dt_op'].between(d - pd.Timedelta(days = 15), d), 'quantity'].sum() \
for d in dd['dt_op']]
I would like to define dd["Lagged_Sales"] as the sum of "quantity" sold in the past 15 days for each product in stock,
that is, for every combination of "dt_op" and "product_code". The expected output:
print(final_dd)
dt_op quantity product_code Lagged_Sales
20/01/18 1 613 1
21/01/18 8 611 8
21/01/18 1 613 2
...
Thanks
Upvotes: 0
Views: 60
Reputation: 51165
Use rolling with a 15d window, grouped by product_code (dt_op must already be a datetime column):
df.set_index('dt_op').groupby('product_code').rolling('15d').quantity.sum()
product_code dt_op
611 2018-01-21 8.0
613 2018-01-20 1.0
2018-01-21 2.0
Name: quantity, dtype: float64
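The result above is a Series indexed by (product_code, dt_op), so it still has to be written back to the frame as a column. A minimal sketch of one way to do that, assuming the sample data from the question and that sorting the frame by product_code and dt_op is acceptable (the groupby output comes back in that order, which is what makes the positional assignment line up). Note that rolling('15D') covers the interval (t - 15 days, t], while between(d - 15 days, d) in the question includes both endpoints:

```python
import pandas as pd

# Sample data from the question; dates are day/month/year.
dd = pd.DataFrame({
    "dt_op": ["20/01/18", "21/01/18", "21/01/18"],
    "quantity": [1, 8, 1],
    "product_code": [613, 611, 613],
})
dd["dt_op"] = pd.to_datetime(dd["dt_op"], format="%d/%m/%y")

# Sort so that row order matches the order of the groupby/rolling output
# (groups in ascending product_code, dates ascending within each group).
dd = dd.sort_values(["product_code", "dt_op"]).reset_index(drop=True)

# 15-day trailing sum of quantity, computed separately per product.
rolled = (
    dd.set_index("dt_op")
      .groupby("product_code")["quantity"]
      .rolling("15D")
      .sum()
)

# Assign by position; this relies on the sort above.
dd["Lagged_Sales"] = rolled.to_numpy()
print(dd)
```

If the original row order matters, keep a copy of the original index before sorting and restore it afterwards.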
Upvotes: 1
Reputation: 402813
IIUC, group with pd.Grouper and "product_code":
df['dt_op'] = pd.to_datetime(df['dt_op'], dayfirst=True, errors='coerce')
df.groupby([pd.Grouper(key='dt_op', freq='15D'), 'product_code']).quantity.sum()
dt_op product_code
2018-01-20 611 8
613 2
Name: quantity, dtype: int64
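One caveat worth checking before using this: pd.Grouper with freq='15D' puts rows into fixed, non-overlapping 15-day bins rather than a trailing window per row, so two sales a couple of days apart can land in different bins. A small sketch with made-up dates (the three rows below are hypothetical, not from the question) that contrasts the two approaches:

```python
import pandas as pd

# Hypothetical data: one product, three sales.
df = pd.DataFrame({
    "dt_op": pd.to_datetime(["2018-01-20", "2018-02-03", "2018-02-05"]),
    "quantity": [1, 2, 4],
    "product_code": [613, 613, 613],
})

# Fixed 15-day bins starting at the first date: [01-20, 02-04) and [02-04, 02-19).
binned = df.groupby(
    [pd.Grouper(key="dt_op", freq="15D"), "product_code"]
)["quantity"].sum()

# Trailing 15-day window ending at each row's own date.
rolled = (
    df.set_index("dt_op")
      .groupby("product_code")["quantity"]
      .rolling("15D")
      .sum()
)

print(binned)  # one total per bin
print(rolled)  # one total per row
```

Here the sales on 2018-02-03 and 2018-02-05 fall into different bins, while the rolling window ending on 2018-02-05 includes both. Which behaviour is wanted depends on whether one value per period or one value per row is needed.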
Upvotes: 1