Reputation: 5940
I have a dataframe df1
that looks like this:
df1 = pd.DataFrame({'A':[0,5,4,8,9,0,7,6],
'B':['a','s','d','f','g','h','j','k'],
'C':['XX','XX','XX','YY','YY','WW','ZZ','ZZ']})
My goal is to group the elements according to the values contained in column C
so that rows having the same value, have the same index (which must contain the value stored in C
). Therefore the output should be like this:
A B
XX 0 a
5 s
4 d
YY 8 f
9 g
WW 0 h
ZZ 7 j
6 k
I tried to use the command df.groupby('C')
but it returns the following object:
<pandas.core.groupby.DataFrameGroupBy object at 0x000000001A9D4860>
Can you suggest me an elegant and smart way to achieve my goal?
Note: I think my question is somehow related to multi-indexing
Upvotes: 2
Views: 7485
Reputation: 862661
It seems you need DataFrame.set_index
df2 = df1.set_index('C')
print (df2)
A B
C
XX 0 a
XX 5 s
XX 4 d
YY 8 f
YY 9 g
WW 0 h
ZZ 7 j
ZZ 6 k
print (df2.loc['XX'])
A B
C
XX 0 a
XX 5 s
XX 4 d
If need MultiIndex
from columns C
and A
:
df3 = df1.set_index(['C', 'A'])
print (df3)
B
C A
XX 0 a
5 s
4 d
YY 8 f
9 g
WW 0 h
ZZ 7 j
6 k
print (df3.loc['XX'])
B
A
0 a
5 s
4 d
Upvotes: 4
Reputation: 30605
I think you are looking for pivot_table
i.e
pd.pivot_table(df1, values='A', index=['C','B'])
Output :
A C B WW h 0 XX a 0 d 4 s 5 YY f 8 g 9 ZZ j 7 k 6
Upvotes: 2