Create running total with multiple columns in Python

Question

I have a dataset, where I would like to sum multiple columns and then create multiple columns with a running total. The output has to be grouped by id and date.

Data

id  date    t1  t2  total   start   curr_t2 curr_t2 
a   q1 22   4   1   5       50      25      20  
a   q2 22   1   1   2       50      25      20  
a   q3 22   0   0   0       50      25      20  
b   q1 22   5   5   10      100     30      40  
b   q2 22   2   2   4       100     30      70  
b   q3 22   3   4   7       100     30      70

Desired

id  date    t1  t2  total   start   cur_t1  cur_t2  final   finalt1 finalt2
a   q1 22   4   1   5       50      25      20      55      29      21
a   q2 22   1   1   2       50      25      20      57      30      22
a   q3 22   0   0   0       50      25      20      57      30      22
b   q1 22   5   5   10      100     30      40      110     35      27
b   q2 22   2   2   4       100     30      70      114     37      29
b   q3 22   3   4   7       100     30      70      121     40      33


**Logic:**

sum the 'start' and 'total' column to create the 'final' column
however, final column is a running total, adding to the 'total' column.

EX.

start = 50 total = 5 so final = 55
then
the next row we have a total of 2 , so the final for the next row is 57 

Similar logic with the newly created: finalt1 and finalt2

etc

Doing

I believe I have to use the .cumsum function to obtain the desired output The output has to be grouped by id and date

out['final'] = out['final'].sub(out.groupby('id')['start'].cumsum())

However, this output is not grouping by pod and date. I am still researching, any suggestion is appreciated.

BENY · Accepted Answer

Try with

df['final'] = df['start'].add(df.groupby('id')['total'].cumsum())
df['final1'] = df['t1'].add(df.groupby('id')['curr_t1'].cumsum())
df['final2'] = df['t2'].add(df.groupby('id')['curr_t2'].cumsum())

Create running total with multiple columns in Python

Answers (1)

Related Questions