Alessandro Ceccarelli

Reputation: 1945

Lagged Sales for every product

Good Evening,

I have the following dataframe:

print(dd)
dt_op      quantity   product_code
20/01/18      1            613
21/01/18      8            611
21/01/18      1            613 
...

I am trying to compute lagged sales, but the following code sums the quantity across all products instead of computing it separately for each product_code:

dd["Lagged_Sales"] = [dd.loc[dd['dt_op'].between(d - pd.Timedelta(days = 15), d), 'quantity'].sum() \
                for d in dd['dt_op']]

I would like to define dd["Lagged_Sales"] as the sum of "quantity" sold in the past 15 days for each product, that is, for every combination of "dt_op" and "product_code".

print(final_dd)

dt_op      quantity   product_code     Lagged_Sales
20/01/18      1            613               1
21/01/18      8            611               8
21/01/18      1            613               2
...

Thanks

Upvotes: 0

Views: 60

Answers (2)

user3483203

Reputation: 51165

Group by product_code and use rolling with a 15-day window (this assumes dt_op has already been converted to datetime):

df.set_index('dt_op').groupby('product_code').rolling('15d').quantity.sum()

product_code  dt_op
611           2018-01-21    8.0
613           2018-01-20    1.0
              2018-01-21    2.0
Name: quantity, dtype: float64
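
To attach the result to the original frame as a column, something like the following should work. This is a minimal sketch built on the three sample rows from the question; the day-first date format and the sort step are assumptions about the real data, and the column name Lagged_Sales is taken from the question.

```python
import pandas as pd

# Sample rows from the question (dates assumed to be day/month/year).
df = pd.DataFrame({
    "dt_op": ["20/01/18", "21/01/18", "21/01/18"],
    "quantity": [1, 8, 1],
    "product_code": [613, 611, 613],
})
df["dt_op"] = pd.to_datetime(df["dt_op"], format="%d/%m/%y")

# Sort so that the row order matches the order of the grouped result:
# groups come out ordered by product_code, rows inside a group by date.
df = df.sort_values(["product_code", "dt_op"]).reset_index(drop=True)

rolled = (
    df.set_index("dt_op")
      .groupby("product_code")["quantity"]
      .rolling("15D")
      .sum()
)

# Positional assignment is safe here because df was sorted the same way.
df["Lagged_Sales"] = rolled.to_numpy()
print(df)
```

Note that a time-based window such as "15D" is closed on the right only by default, so a sale exactly 15 days before the current row is excluded, whereas between() in the question includes both endpoints. Passing closed="both" to rolling should reproduce the inclusive behaviour.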

Upvotes: 1

cs95

Reputation: 402813

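If I understand correctly, group with pd.Grouper and "product_code". Note that this sums the quantity within fixed 15-day bins rather than over a trailing window for each row: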

# The sample dates are day/month/year, so state the format explicitly
df.dt_op = pd.to_datetime(df.dt_op, format='%d/%m/%y', errors='coerce')
df.groupby([pd.Grouper(key='dt_op', freq='15D'), 'product_code']).quantity.sum()

dt_op       product_code
2018-01-20  611             8
            613             2
Name: quantity, dtype: int64
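
The difference between fixed bins and a trailing window only shows up once the data spans more than one bin. The sketch below adds one hypothetical sale (04/02/18, product 613, quantity 3) that is not in the question, purely to illustrate this; everything else is the sample data.

```python
import pandas as pd

# Sample rows from the question plus one hypothetical later sale (last row).
df = pd.DataFrame({
    "dt_op": pd.to_datetime(
        ["20/01/18", "21/01/18", "21/01/18", "04/02/18"], format="%d/%m/%y"
    ),
    "quantity": [1, 8, 1, 3],
    "product_code": [613, 611, 613, 613],
})

# Fixed, non-overlapping 15-day bins, starting at the first day in the data.
binned = df.groupby(
    [pd.Grouper(key="dt_op", freq="15D"), "product_code"]
)["quantity"].sum()

# Trailing 15-day window ending at each row, computed per product.
trailing = (
    df.set_index("dt_op")
      .groupby("product_code")["quantity"]
      .rolling("15D")
      .sum()
)

print(binned)
print(trailing)
```

For the added row, the binned sum only counts sales from 04/02/18 onward because a new bin starts on that date, while the trailing sum also includes the 21/01/18 sale of the same product. Which of the two is wanted depends on whether "the past 15 days" is meant per row or per calendar block.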

Upvotes: 1
