Florent

Reputation: 1938

How to sum columns with a duplicate name with Pandas?

I have a dataframe with duplicate column names and I would like to sum the columns that share a name.

>df

      A  B  A  B
1    12  2  4  1
2    10  5  4  9
3     2  1  4  8
4     2  4  3  8

What I would like is something like this:

      A   B  
1    16   3  
2    14  14  
3     6   9  
4     5  12 

I can select the duplicate columns in a loop, but I don't know how to remove them and create a new column with the summed values. Is there a more elegant way?

col = list(df.columns)
# Names that occur more than once
dup = list(set([x for x in col if col.count(x) > 1]))
for d in dup:
    # df[d] returns every column named d; sum them row-wise
    total = df[d].sum(axis=1)

Upvotes: 0

Views: 250

Answers (2)

FAHAD SIDDIQUI

Reputation: 641

Try this (note that grouping with axis=1 is deprecated as of pandas 2.1):

df.groupby(lambda x: x, axis=1).sum()
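The same grouping can also be done on the transposed frame, which avoids the axis argument altogether. The following is only a sketch, assuming the sample data from the question; the name `summed` is made up for illustration.

```python
import pandas as pd

# Sample frame from the question, with duplicate column names.
df = pd.DataFrame(
    [[12, 2, 4, 1], [10, 5, 4, 9], [2, 1, 4, 8], [2, 4, 3, 8]],
    columns=["A", "B", "A", "B"],
    index=[1, 2, 3, 4],
)

# Transpose so the duplicate names become index labels, group equal
# labels together, sum each group, then transpose back.
summed = df.T.groupby(level=0).sum().T
print(summed)
```

Because groupby sorts its keys by default, the resulting columns come out in sorted order rather than in order of first appearance.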

Upvotes: 1

BENY

Reputation: 323386

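Let us try this (note that the level argument of sum was deprecated in pandas 1.3 and removed in 2.0, so it only works on older versions):

sum_df = df.sum(level=0, axis=1)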
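If a shortcut like this is not available, one possible way to get the same result is to select each name's columns with a boolean mask and sum them row-wise. This is a rough sketch rather than a definitive solution; `sum_duplicate_columns` and `frame` are hypothetical names introduced here, and the data is the sample from the question.

```python
import pandas as pd


def sum_duplicate_columns(frame):
    """Hypothetical helper: sum columns that share a name, row by row."""
    return pd.DataFrame({
        # The boolean mask always yields a DataFrame, even when a name
        # occurs only once, so sum(axis=1) works for every name.
        name: frame.loc[:, frame.columns == name].sum(axis=1)
        for name in frame.columns.unique()
    })


# Sample frame from the question, with duplicate column names.
sample = pd.DataFrame(
    [[12, 2, 4, 1], [10, 5, 4, 9], [2, 1, 4, 8], [2, 4, 3, 8]],
    columns=["A", "B", "A", "B"],
    index=[1, 2, 3, 4],
)

sum_df = sum_duplicate_columns(sample)
print(sum_df)
```

Unlike a groupby, this keeps the columns in order of first appearance, at the cost of one pass over the frame per distinct name.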

Upvotes: 2
