Reputation: 8247
I have following dataframe in pandas
code tank nozzle_1 nozzle_2 nozzle_var
123 1 1 1 10
123 1 2 2 12
123 2 1 1 10
123 2 2 2 12
I want to calculate cumulative sum of columns nozzle_1 and nozzle_2 grouping over tank. Following is my desired dataframe.
code tank nozzle_1 nozzle_2 nozzle_var nozzle_1_cumsum nozzle_2_cumsum
123 1 1 1 10 1 1
123 1 2 2 12 3 3
123 2 1 1 10 1 1
123 2 2 2 12 3 3
I am getting nozzle_1 and nozzle_2 from following code in pandas
cols= df.columns[df.columns.str.contains(pat='nozzle_\d+$', regex=True)]
How can I calculate cumsum from above list of columns
Upvotes: 1
Views: 67
Reputation: 5451
import pandas as pd
d = {
"tank":[1,1,1,2],
"nozzle_1":[1,2,1,2],
"nozzle_2":[1,2,1,2]
}
df = pd.DataFrame(d)
columns = df.columns[df.columns.str.match("nozzle_\d+$")]
df[columns + "_cumsum"] = \
df.groupby("tank")[columns].cumsum(axis=0)
df
Output
tank nozzle_1 nozzle_2 nozzle_1_cumsum nozzle_2_cumsum
0 1 1 1 1 1
1 1 2 2 3 3
2 1 1 1 4 4
3 2 2 2 2 2
Upvotes: 1
Reputation: 153460
How about this fancy solution:
cols= df.columns[df.columns.str.contains(pat='nozzle_\d+$', regex=True)]
df.assign(**df.groupby('tank')[cols].agg(['cumsum'])\
.pipe(lambda x: x.set_axis(x.columns.map('_'.join), axis=1, inplace=False)))
Output:
tank nozzle_1 nozzle_2 nozzle_var nozzle_1_cumsum nozzle_2_cumsum
0 1 1 1 10 1 1
1 1 2 2 12 3 3
2 2 1 1 10 1 1
3 2 2 2 12 3 3
In steps:
df_cumsum = df.groupby('tank')[cols].agg(['cumsum'])
df_cumsum.columns = df_cumsum.columns.map('_'.join)
pd.concat([df, df_cumsum], axis=1)
Output:
tank nozzle_1 nozzle_2 nozzle_var nozzle_1_cumsum nozzle_2_cumsum
0 1 1 1 10 1 1
1 1 2 2 12 3 3
2 2 1 1 10 1 1
3 2 2 2 12 3 3
Upvotes: 2