Federico Gentile
Federico Gentile

Reputation: 5940

Group by column value and set it as index in Pandas

I have a dataframe df1 that looks like this:

df1 = pd.DataFrame({'A':[0,5,4,8,9,0,7,6],
                   'B':['a','s','d','f','g','h','j','k'],
                   'C':['XX','XX','XX','YY','YY','WW','ZZ','ZZ']})    

My goal is to group the elements according to the values contained in column Cso that rows having the same value, have the same index (which must contain the value stored in C). Therefore the output should be like this:

    A  B   
XX  0  a  
    5  s  
    4  d  
YY  8  f  
    9  g  
WW  0  h  
ZZ  7  j  
    6  k  

I tried to use the command df.groupby('C') but it returns the following object:

<pandas.core.groupby.DataFrameGroupBy object at 0x000000001A9D4860>

Can you suggest me an elegant and smart way to achieve my goal?

Note: I think my question is somehow related to multi-indexing

Upvotes: 2

Views: 7485

Answers (2)

jezrael
jezrael

Reputation: 862661

It seems you need DataFrame.set_index

df2 = df1.set_index('C')
print (df2)
    A  B
C       
XX  0  a
XX  5  s
XX  4  d
YY  8  f
YY  9  g
WW  0  h
ZZ  7  j
ZZ  6  k

print (df2.loc['XX'])
    A  B
C       
XX  0  a
XX  5  s
XX  4  d

If need MultiIndex from columns C and A:

df3 = df1.set_index(['C', 'A'])
print (df3)
      B
C  A   
XX 0  a
   5  s
   4  d
YY 8  f
   9  g
WW 0  h
ZZ 7  j
   6  k

print (df3.loc['XX'])
   B
A   
0  a
5  s
4  d

Upvotes: 4

Bharath M Shetty
Bharath M Shetty

Reputation: 30605

I think you are looking for pivot_table i.e

pd.pivot_table(df1, values='A', index=['C','B'])

Output :

      A
C  B   
WW h  0
XX a  0
   d  4
   s  5
YY f  8
   g  9
ZZ j  7
   k  6

Upvotes: 2

Related Questions