user1039698

Reputation: 173

How to sum a list of pandas dataframes with respect to a given column

I have a list of pandas dataframes, each with two columns, basically a class and a value:

df1:

Name Count
Bob 10
John 20

df2:

Name Count
Mike 30
Bob 40

The same names may or may not appear in different dataframes, and the list contains over 100 dataframes. Within each dataframe, however, all "Name" values are unique.

What I need is to iterate over all the dataframes and build one big dataframe containing every value from "Name" together with the total of its "Count" across all the dataframes, like so:

result:

Name Count
Bob 50
John 20
Mike 30

Bob's counts are summed; the others are unchanged because they appear only once. Is there an efficient way to do this when there are many dataframes?

Upvotes: 1

Views: 812

Answers (2)

adir abargil

Reputation: 5745

You can do the following. If some names appear in only one of the dataframes, pass fill_value=0 so that they still get a value:

df1.set_index('Name').add(df2.set_index('Name'), fill_value=0).reset_index()

    Name    Count
0   Bob     50.0
1   John    20.0
2   Mike    30.0
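The one-liner above handles two dataframes. A minimal sketch of extending it to a whole list, assuming the list is called `dfs` (a name not used in this answer) and that every dataframe has exactly the `Name` and `Count` columns from the question, could fold the same `add` call over the list with `functools.reduce`:

```python
from functools import reduce

import pandas as pd

# Sample data copied from the question; `dfs` is an assumed name for the list.
df1 = pd.DataFrame({"Name": ["Bob", "John"], "Count": [10, 20]})
df2 = pd.DataFrame({"Name": ["Mike", "Bob"], "Count": [30, 40]})
dfs = [df1, df2]

# Index every dataframe by Name so that add() aligns rows on the name.
indexed = [df.set_index("Name") for df in dfs]

# Fold the pairwise add over the list; fill_value=0 treats a name that is
# missing from one side as contributing 0 instead of producing NaN.
total = reduce(lambda acc, df: acc.add(df, fill_value=0), indexed)

# Alignment turns the counts into floats, so cast back to int.
result = total.astype(int).reset_index()
print(result)
```

Each step re-aligns the accumulated index, so with well over 100 dataframes this is likely to be slower than concatenating once and grouping.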

Upvotes: 2

ABC

Reputation: 645

Use pd.concat, then groupby:

df = pd.concat(dfs) # where dfs is a list of dataframes 

Then you can do:

gp = df.groupby(['Name'])['Count'].sum()
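Note that `gp` here is a Series indexed by `Name`, not a two-column dataframe like the one in the question. A self-contained sketch of the same concat-then-groupby approach that returns a dataframe, using the question's sample data and assuming the list is named `dfs`, might look like this:

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({"Name": ["Bob", "John"], "Count": [10, 20]})
df2 = pd.DataFrame({"Name": ["Mike", "Bob"], "Count": [30, 40]})
dfs = [df1, df2]

# Stack all rows into one long dataframe, then sum Count per Name.
# as_index=False keeps Name as a regular column in the result.
combined = pd.concat(dfs, ignore_index=True)
result = combined.groupby("Name", as_index=False)["Count"].sum()
print(result)
```

This concatenates once and aggregates once, regardless of how many dataframes are in the list, and the counts keep their integer dtype.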

Upvotes: 3
