Reputation: 215
I'm new to pandas. My df looks like this:
A A A B B B
a NaN NaN 2 NaN NaN 5
b NaN 1 NaN 9 NaN NaN
c 3 NaN NaN 7 NaN NaN
How can I get
A B
a 2 5
b 1 9
c 3 7
It looks like merge and join are meant for more than one dataframe. I have also tried
df.groupby(by=[A,B], axis=1)
but got
ValueError: Grouper and axis must be same length
Upvotes: 3
Views: 8469
Reputation: 862581
I believe you need to group by the first level of the columns and apply an aggregate function such as sum, mean, first or last:
import pandas as pd
# group the columns that share a label and sum each group (NaN is skipped)
df = df.groupby(level=0, axis=1).sum()
print (df)
A B
a 2.0 5.0
b 1.0 9.0
c 3.0 7.0
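For a version that can be run end to end, here is a minimal sketch. It rebuilds the frame from the question (the literal values are my reconstruction of the printed table) and groups on the transposed frame, because the axis=1 argument of groupby is deprecated as of pandas 2.1. Transposing, grouping on the index and transposing back gives the same result here; the name out is just illustrative.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Reconstruction of the question's frame: duplicated column labels A and B
df = pd.DataFrame(
    [[nan, nan, 2, nan, nan, 5],
     [nan, 1, nan, 9, nan, nan],
     [3, nan, nan, 7, nan, nan]],
    index=["a", "b", "c"],
    columns=["A", "A", "A", "B", "B", "B"],
)

# Group the duplicated labels on the transposed frame, then transpose back.
# This avoids groupby(..., axis=1).
out = df.T.groupby(level=0).sum().T
print(out)
```

If integer output is wanted, out.astype(int) should work as long as no NaN remains.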
And if you need to filter columns by name, select a subset first:
df = df[['A','B']].groupby(level=0, axis=1).sum()
If you are working with index values instead:
df1 = df.T
print (df1)
a b c
A NaN NaN 3.0
A NaN 1.0 NaN
A 2.0 NaN NaN
B NaN 9.0 7.0
B NaN NaN NaN
B 5.0 NaN NaN
df = df1.groupby(level=0).sum()
# axis=0 is the default, so it can be omitted above
#df = df1.groupby(level=0, axis=0).sum()
print (df)
a b c
A 2.0 1.0 3.0
B 5.0 9.0 7.0
Upvotes: 6
Reputation: 164623
One clean way is to use a list comprehension with numpy.isfinite:
import pandas as pd, numpy as np
arr = [list(filter(np.isfinite, x)) for x in df.values]
res = pd.DataFrame(arr, columns=['A', 'B'], index=['a', 'b', 'c'], dtype=int)
Result:
A B
a 2 5
b 1 9
c 3 7
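Note that this only works when every row keeps exactly one finite value per label. A hedged variant that checks that assumption and takes the labels from the frame instead of hard-coding them might look like this (the sample frame is my reconstruction of the question's data; labels and rows are illustrative names):

```python
import numpy as np
import pandas as pd

nan = np.nan
df = pd.DataFrame(
    [[nan, nan, 2, nan, nan, 5],
     [nan, 1, nan, 9, nan, nan],
     [3, nan, nan, 7, nan, nan]],
    index=["a", "b", "c"],
    columns=["A", "A", "A", "B", "B", "B"],
)

# Unique column labels in order of first appearance
labels = list(df.columns.unique())

# Keep only the finite entries of each row, preserving column order
rows = [row[np.isfinite(row)] for row in df.to_numpy()]

# Assumption: exactly one finite value per label in every row
assert all(len(r) == len(labels) for r in rows)

res = pd.DataFrame(rows, columns=labels, index=df.index).astype(int)
print(res)
```

If a row can hold several values for one label, or none, a groupby-based approach is the safer choice.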
Upvotes: 0
Reputation: 323226
Maybe use first:
df.groupby(df.columns,axis=1).first()
Out[35]:
A B
a 2.0 5.0
b 1.0 9.0
c 3.0 7.0
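first and sum give the same output on the question's data, but they differ when a row has no value at all for a label: sum returns 0.0 for an all-NaN group by default, while first keeps NaN. A small sketch with made-up data (not from the question), written on the transposed frame to avoid axis=1:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row b has no value at all under label A
df = pd.DataFrame(
    [[np.nan, 2.0, np.nan, 5.0],
     [np.nan, np.nan, 9.0, np.nan]],
    index=["a", "b"],
    columns=["A", "A", "B", "B"],
)

by_label = df.T.groupby(level=0)

summed = by_label.sum().T               # all-NaN group becomes 0.0
firsts = by_label.first().T             # all-NaN group stays NaN
strict = by_label.sum(min_count=1).T    # sum that keeps NaN for empty groups

print(summed)
print(firsts)
```

So first (or sum with min_count=1) is probably the better fit when a missing value should stay missing.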
Upvotes: 2