Smidem
Smidem

Reputation: 42

What's the best way to calculate a daily cumulative sum?

I found a way to get the number I need, but I was hoping to get some input on how to accomplish it in a less cumbersome way. I need a running total of transactions to date in order to make it into a plotly plot. The data I have only includes a few columns: id, date, and amount. Here's the code I have so far:

fy20 = pd.read_excel('./data/transactions.xlsx', parse_dates=['date'])
def daily_money(df):
    df = df.groupby('date').amount.sum()
    df = df.groupby(df.index.day).cumsum()
    df = df.cumsum().to_frame().reset_index()
    return df

fy20 = daily_money(fy20)

This appears to accomplish the goal, but it seems like there must be a simpler way. Please let me know if you have any suggestions on how to simplify this.

Upvotes: 0

Views: 974

Answers (1)

jirassimok
jirassimok

Reputation: 4263

It looks to me like this should work:

df.groupby('date')['amount'].sum().cumsum()

This works because DataFrame.groupby automatically sorts by the group keys, so the cumulative sum is already looking at the data it needs.

If you want it as a DataFrame with a new index instead of a Series, you can just use Series.reset_index, which converts the series to a DataFrame first, but unless you need the date as a normal column (rather than the index) later, you don't need to do that.

Upvotes: 1

Related Questions