Cumulative Sum Function on Pandas Data Frame

Question

I am attempting to capture a "running" cumulative sum given a series of period amounts.

See example:

df = df[1:4].cumsum() # this doesn't return the desired result

Brian · Accepted Answer

You're looking for the axis parameter. Many pandas functions take this argument to set the direction of an operation. With axis=0 (the default), cumsum accumulates down the rows within each column; with axis=1, it accumulates across the columns within each row. Your running total moves across the columns, so you want axis=1.

df.cumsum(axis=1) by itself works on your example to produce the output table.

In [3]: df.cumsum(axis=1)
Out[3]:
      1   2   3   4
10   16  30  41  61
51   13  29  40  50
13   11  30  45  61
321  12  27  37  52
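
For anyone who wants to reproduce this, here is a minimal sketch. The original input table is not shown in this thread, so the period amounts below are inferred by differencing the cumulative output above, and the column labels are assumed to be strings; the names across_columns and down_rows are just for illustration.

```python
import pandas as pd

# Period amounts inferred by differencing the cumulative table above;
# index and column labels are copied from that output (columns as strings).
df = pd.DataFrame(
    [[16, 14, 11, 20],
     [13, 16, 11, 10],
     [11, 19, 15, 16],
     [12, 15, 10, 15]],
    index=[10, 51, 13, 321],
    columns=["1", "2", "3", "4"],
)

# axis=1: running total across the columns, one row at a time.
across_columns = df.cumsum(axis=1)

# axis=0 (the default): running total down the rows, one column at a time.
down_rows = df.cumsum(axis=0)

print(across_columns)
print(down_rows)
```

Printing both frames side by side makes the difference in direction easy to see: only the axis=1 result matches the table above.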

I suspect you're interested in restricting to a specific range of columns, though. To do that, you can use .loc with the column labels (they are strings in my example).

In [4]: df.loc[:, '2':'3'].cumsum(axis=1)
Out[4]:
      2   3
10   14  25
51   16  27
13   19  34
321  15  25

.loc is label-based and is inclusive of the bounds. If you want to find out more about indexing in Pandas, check the docs.
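
As a rough illustration of the inclusive bounds, the sketch below compares the label-based .loc slice with the position-based .iloc equivalent, which excludes its end position. It also shows that a plain slice such as df[1:4] selects rows rather than columns, which is probably why the attempt in the question did not give the expected result. The frame is reconstructed from the outputs above (an assumption, since the original input is not shown), and the variable names are hypothetical.

```python
import pandas as pd

# Same frame as in the answer, reconstructed from the cumulative outputs.
df = pd.DataFrame(
    [[16, 14, 11, 20],
     [13, 16, 11, 10],
     [11, 19, 15, 16],
     [12, 15, 10, 15]],
    index=[10, 51, 13, 321],
    columns=["1", "2", "3", "4"],
)

# .loc slices by label and includes both endpoints: columns "2" and "3".
by_label = df.loc[:, "2":"3"].cumsum(axis=1)

# .iloc slices by position and excludes the end: positions 1 and 2.
by_position = df.iloc[:, 1:3].cumsum(axis=1)

# A plain integer slice on the frame selects rows by position, not columns.
rows_only = df[1:4]

print(by_label)
print(rows_only)
```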
