Reputation: 1
I have a dataset like this:
Customer ID | Date | Profit |
---|---|---|
1 | 4/13/2018 | 10.00 |
1 | 4/26/2018 | 13.27 |
1 | 10/23/2018 | 15.00 |
2 | 1/1/2017 | 7.39 |
2 | 7/5/2017 | 9.99 |
2 | 7/7/2017 | 10.01 |
3 | 5/4/2019 | 30.30 |
I'd like to groupby and sum profit, for every 6 months, starting at each users first transaction.
The output ideally should look like this:
Customer ID | Date | Profit |
---|---|---|
1 | 4/13/2018 | 23.27 |
1 | 10/13/2018 | 15.00 |
2 | 1/1/2017 | 7.39 |
2 | 7/1/2017 | 20.00 |
3 | 5/4/2019 | 30.30 |
The closest I've seem to get on this problem is by using:
df.groupby(['Customer ID',pd.Grouper(key='Date', freq='6M', closed='left')])['Profit'].sum().reset_index()
But, that doesn't seem to sum starting on a users first transaction day.
If the changing of dates is not possible (ex. customer 2 date is 7/1/2017 and not 7/5/2017), then at least summing the profit so that its based on each users own 6 month purchase journey would be extremely helpful. Thank you!
Upvotes: 0
Views: 136
Reputation: 3708
I can get you the first of the month until you find a more perfect solution.
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")
df = (
df
.set_index("Date")
.groupby(["Customer ID"])
.Profit
.resample("6MS")
.sum()
.reset_index(name="Profit")
)
print(df)
Customer ID Date Profit
0 1 2018-04-01 23.27
1 1 2018-10-01 15.00
2 2 2017-01-01 7.39
3 2 2017-07-01 20.00
4 3 2019-05-01 30.30
Upvotes: 1