Reputation: 549
How can I sum the pandas rows built from data_in
so that the result looks like the rows in data_out?
import pandas as pd

data_in = [
{ 'col-a':'a1', 'col-b':'b1', 'col-z':'z1', 'value':1},
{ 'col-a':'a1', 'col-b':'b1', 'col-z':'z1', 'value':2},
{ 'col-a':'a2', 'col-b':'b2', 'col-z':'z2', 'value':10},
{ 'col-a':'a2', 'col-b':'b2', 'col-z':'z2', 'value':20}
]
df = pd.DataFrame(data_in)
# which operation to apply on df to get rows like in data_out?
# ...
data_out = [
{ 'col-a':'a1', 'col-b':'b1', 'col-z':'z1', 'value':3},
{ 'col-a':'a2', 'col-b':'b2', 'col-z':'z2', 'value':30}
]
Upvotes: 0
Views: 190
Reputation: 655
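This is the line you are looking for (it returns a Series indexed by the three grouping columns; add .reset_index() if you want a DataFrame):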
df.groupby(["col-a", "col-b", "col-z"])["value"].sum()
Upvotes: 1
Reputation: 42916
This is an aggregation problem. Use .groupby in pandas to group on the key columns, then sum the value column within each group with .value.sum(). Calling .reset_index() afterwards turns the group keys back into regular columns:
df_out = df.groupby(['col-a', 'col-b', 'col-z']).value.sum().reset_index()
print(df_out)
  col-a col-b col-z  value
0    a1    b1    z1      3
1    a2    b2    z2     30
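If the goal is a list of dicts shaped exactly like data_out, here is a minimal sketch of the full round trip. It assumes as_index=False as an alternative to .reset_index() and uses to_dict("records") for the conversion; the name rows is only illustrative.

```python
import pandas as pd

data_in = [
    {'col-a': 'a1', 'col-b': 'b1', 'col-z': 'z1', 'value': 1},
    {'col-a': 'a1', 'col-b': 'b1', 'col-z': 'z1', 'value': 2},
    {'col-a': 'a2', 'col-b': 'b2', 'col-z': 'z2', 'value': 10},
    {'col-a': 'a2', 'col-b': 'b2', 'col-z': 'z2', 'value': 20},
]
df = pd.DataFrame(data_in)

# as_index=False keeps the grouping keys as ordinary columns,
# so no separate reset_index() call is needed.
df_out = df.groupby(['col-a', 'col-b', 'col-z'], as_index=False)['value'].sum()

# Convert each row of the aggregated frame back into a dict.
rows = df_out.to_dict('records')
print(rows)
```

Bracket access (['value']) is used here instead of .value because it also works for column names that are not valid Python identifiers, such as 'col-a'.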
Upvotes: 2