Reputation: 215

How to join columns sharing the same name within a dataframe

I'm new to pandas. My df looks like this:

  A   A   A   B   B   B
a NaN NaN 2   NaN NaN 5
b NaN 1   NaN 9   NaN NaN
c 3   NaN NaN NaN 7   NaN

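How can I get a single A column and a single B column, each holding the non-NaN value from the columns with that name?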

It looks like merge and join are meant for combining more than one dataframe. I have also tried

df.groupby(by=[A,B], axis=1)

but got

ValueError: Grouper and axis must be same length

Upvotes: 3

Answers (3)

jezrael

Reputation: 862581

I believe you need to group by the first level of the column labels and apply an aggregate function such as sum, mean, first, or last:

import pandas as pd

# Group the columns by their label (level 0 of the column index) and sum each group
df = df.groupby(level=0, axis=1).sum()
print(df)
     A    B
a  2.0  5.0
b  1.0  9.0
c  3.0  7.0
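
Which aggregate you pick matters once a row has no value at all for one of the names. Here is a minimal sketch of the difference, using a small made-up frame (not the one from the question) in which group A is entirely NaN in row b. The grouping is done on the transposed frame so the snippet does not depend on `axis=1`, which recent pandas releases have deprecated for groupby as far as I know.

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration: group "A" is entirely NaN in row "b".
df = pd.DataFrame(
    [[np.nan, 2.0, 5.0, np.nan],
     [np.nan, np.nan, np.nan, 9.0]],
    index=["a", "b"],
    columns=["A", "A", "B", "B"],
)

# Transpose so the duplicated labels sit in the index, then group on them.
grouped = df.T.groupby(level=0)

summed = grouped.sum().T                     # an all-NaN group sums to 0.0
summed_strict = grouped.sum(min_count=1).T   # an all-NaN group stays NaN
first_valid = grouped.first().T              # first non-NaN value, else NaN

print(summed)
print(summed_strict)
print(first_valid)
```

So if a 0.0 would be misleading for a row with no data, `sum(min_count=1)` or `first()` is probably the safer choice.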

To restrict the result to particular column names, select them first:

df = df[['A','B']].groupby(level=0, axis=1).sum()

If the duplicated labels are in the index instead of the columns:

df1 = df.T
print (df1)
     a    b    c
A  NaN  NaN  3.0
A  NaN  1.0  NaN
A  2.0  NaN  NaN
B  NaN  9.0  7.0
B  NaN  NaN  NaN
B  5.0  NaN  NaN

df = df1.groupby(level=0).sum()
# axis=0 is the default, so it can be omitted above
# df = df1.groupby(level=0, axis=0).sum()
print(df)
     a    b    c
A  2.0  1.0  3.0
B  5.0  9.0  7.0
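
The index-based grouping above can also be used as a round trip to get the column-wise result: transpose, group on the index, transpose back. This is a sketch that rebuilds the sample frame from the transposed output shown above (the construction with `list("AAABBB")` is my own assumption about how the frame was made). It may be useful on newer pandas versions, where passing `axis=1` to groupby is deprecated as far as I know.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Sample frame with duplicated column labels; values follow the
# transposed output printed above.
df = pd.DataFrame(
    [[nan, nan, 2, nan, nan, 5],
     [nan, 1, nan, 9, nan, nan],
     [3, nan, nan, 7, nan, nan]],
    index=list("abc"),
    columns=list("AAABBB"),
)

# Transpose, group rows that share a label, aggregate, transpose back.
result = df.T.groupby(level=0).sum().T
print(result)
```

This should give the same A/B frame as the `axis=1` version, at the cost of two transposes.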

Upvotes: 6

jpp

Reputation: 164623

One clean way is to use a list comprehension with numpy.isfinite:

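import numpy as np
import pandas as pd

# Keep only the finite (non-NaN) values of each row, in column order
arr = [list(filter(np.isfinite, x)) for x in df.values]

# Assumes every row has exactly one finite value for A and one for B
res = pd.DataFrame(arr, columns=['A', 'B'], index=['a', 'b', 'c'], dtype=int)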

Result:

   A  B
a  2  5
b  1  9
c  3  7
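
A self-contained sketch of the same idea, with one guard added. It assumes every row has exactly one finite value per distinct column name, and it checks that before building the frame, since the approach relies on position rather than on the column labels. The sample values and the `names` helper are my own additions for illustration.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Sample frame with duplicated column labels (values assumed for illustration).
df = pd.DataFrame(
    [[nan, nan, 2, nan, nan, 5],
     [nan, 1, nan, 9, nan, nan],
     [3, nan, nan, 7, nan, nan]],
    index=list("abc"),
    columns=list("AAABBB"),
)

# Keep only the finite entries of each row, left to right.
arr = [[v for v in row if np.isfinite(v)] for row in df.to_numpy()]

# Distinct column names in their original order: ['A', 'B'].
names = list(dict.fromkeys(df.columns))

# Guard: the positional approach only works if each row kept one value per name.
if any(len(row) != len(names) for row in arr):
    raise ValueError("expected exactly one finite value per column name in every row")

res = pd.DataFrame(arr, columns=names, index=df.index).astype(int)
print(res)
```

If some rows can have missing or extra values, a groupby on the labels is likely the more robust route.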

Upvotes: 0

BENY

Reputation: 323226

Maybe use first, which takes the first non-NaN value in each group:

df.groupby(df.columns,axis=1).first()
Out[35]: 
     A    B
a  2.0  5.0
b  1.0  9.0
c  3.0  7.0

Upvotes: 2
