Karan Arya
Karan Arya

Reputation: 104

Multiple Levels of Groupby and Sum

I have a dataframe with the following initialization and I want to find if the values in column 'a' are present in column 'b'. Then, if the value is present, I want the sum of all the corresponding values of column 'c'.

NA

df = pd.DataFrame({'a': [1,2,3, 1, 4, 1, 2],
              'b': [1,5,1, 2, 3, 1, 3],
              'c': [10,20,40, 50, 60, 70, 100]})

Sample result is in the image link below:

enter image description here

Upvotes: 1

Views: 129

Answers (3)

Mark Wang
Mark Wang

Reputation: 2757

df[['a','b']].join(df.groupby('b').c.sum(),on='a').fillna(0, downcast='infer')

or

df.groupby('b').c.sum().reindex(df.a,fill_value=0).reset_index().assign(b=df.b).sort_index(axis=1)

or

df.assign(c = df.groupby('b').c.sum().reindex(df.a, fill_value=0).reset_index(drop=True))

Upvotes: 1

rich
rich

Reputation: 570

This will sum the values for column c for each value in column b, as long as the value in b exists in a.

import pandas as pd


df = pd.DataFrame({'a': [1,2,3, 1, 4, 1, 2],
              'b': [1,5,1, 2, 3, 1, 3],
              'c': [10,20,40, 50, 60, 70, 100]})


new_df = df[['a']].drop_duplicates().merge(df[['b', 'c']], left_on = 'a', right_on = 'b', how = 'left').groupby('a', as_index = False)['c'].sum()

Upvotes: 0

ivallesp
ivallesp

Reputation: 2222

Take a look at the following one-liner :D

df[["a"]].merge(df.groupby("b").c.sum().reset_index().rename(columns={"b":"a"}), how="left").fillna(0)

Upvotes: 0

Related Questions