Reputation: 1998
I have df as
Sales | Users | DT
40000 234 1/4/2018
500 100 1/23/2018
4000 555 2/4/2018
234 1000 6/4/2018
1000 2000 6/14/2018
50000 1200 9/12/2018
40000 234 1/14/2019
500 100 2/23/2019
2145 555 2/26/2019
344 1000 6/14/2019
1234 1233 6/21/2019
432 5000 11/12/2019
Is there method to group-by to both sum up sales by yr and also create a rolling total of users as my users column is composed of both new users, and users from previous date so basically group by year, sum sales, and then sum users from next date - prev date users
Thank you
Upvotes: 0
Views: 69
Reputation: 13518
With the dataframe you provided:
import pandas as pd
df = pd.DataFrame(
{
"Sales": [40000, 500, 4000, 234, 1000, 50000, 40000, 500, 2145, 344, 1234, 432],
"Users": [234, 100, 555, 1000, 2000, 1200, 234, 100, 555, 1000, 1233, 5000],
"DT": ["1/4/2018", "1/23/2018", "2/4/2018", "6/4/2018", "6/14/2018", "9/12/2018", "1/14/2019", "2/23/2019", "2/26/2019", "6/14/2019", "6/21/2019", "11/12/2019"],
}
)
Here is one way to do it using to_datetime and cumsum:
df["DT"] = pd.to_datetime(df["DT"]).dt.year
df = (
df.groupby("DT")
.agg(sum)
.pipe(lambda df_: df_.assign(Users=df_["Users"].cumsum()))
.reset_index()
)
print(df)
# Output
DT Sales Users
0 2018 95734 5089
1 2019 44655 13211
Upvotes: 1