Pandas dataframe groupby

Question

I am a beginner in Pandas so please bear with me. I know this is a very basic question.

I am working with pandas on the following dataframe:

x  y  w
1  2  5
1  2  7
3  4  3
5  4  8
3  4  5
5  9  9

And I want the following output:

x  y  w
1  2  5,7
3  4  3,5
5  4  8
5  9  9

Can anyone tell me how to do this using pandas groupby?

jezrael · Accepted Answer

You can use groupby with apply and ','.join:

# str.join only accepts strings, so convert column w if it is not already str
print(type(df.at[0, 'w']))

df['w'] = df['w'].astype(str)

print(df.groupby(['x', 'y'])['w'].apply(','.join).reset_index())
   x  y    w
0  1  2  5,7
1  3  4  3,5
2  5  4    8
3  5  9    9
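
For anyone who wants to run this from scratch, here is a minimal self-contained sketch of the approach above. The DataFrame is rebuilt by hand from the values in the question, and the variable name `result` is only for illustration:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "x": [1, 1, 3, 5, 3, 5],
    "y": [2, 2, 4, 4, 4, 9],
    "w": [5, 7, 3, 8, 5, 9],
})

# str.join only accepts strings, so convert w first
df["w"] = df["w"].astype(str)

# Join the w values of each (x, y) group with commas,
# then turn the group keys back into ordinary columns
result = df.groupby(["x", "y"])["w"].apply(",".join).reset_index()
print(result)
```

Note that groupby sorts the group keys by default, so the rows come out ordered by (x, y), while the values inside each group keep their original row order.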

If you have duplicates, use drop_duplicates:

print(df)
   x  y  w
0  1  2  5
1  1  2  5
2  1  2  5
3  1  2  7
4  3  4  3
5  5  4  8
6  3  4  5
7  5  9  9

df['w'] = df['w'].astype(str)
# the parentheses allow the method chain to continue on the next lines
print(df.groupby(['x', 'y'])['w']
        .apply(lambda x: ','.join(x.drop_duplicates()))
        .reset_index())

   x  y    w
0  1  2  5,7
1  3  4  3,5
2  5  4    8
3  5  9    9

Or a modified version of EdChum's solution, which converts to str inside the lambda:

print(df.groupby(['x', 'y'])['w']
        .apply(lambda x: ','.join(x.astype(str).drop_duplicates()))
        .reset_index())

   x  y    w
0  1  2  5,7
1  3  4  3,5
2  5  4    8
3  5  9    9
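
As a runnable sketch of the duplicates case, assuming the same sample data as printed above: the helper name `join_unique` is made up for this example, and because the conversion to str happens inside it, the original w column keeps its integer dtype.

```python
import pandas as pd

# Sample data with repeated (x, y, w) rows, as in the duplicates example
df = pd.DataFrame({
    "x": [1, 1, 1, 1, 3, 5, 3, 5],
    "y": [2, 2, 2, 2, 4, 4, 4, 9],
    "w": [5, 5, 5, 7, 3, 8, 5, 9],
})

def join_unique(values):
    # Hypothetical helper: convert to str, drop repeated values
    # (keeping the first occurrence), then join with commas
    return ",".join(values.astype(str).drop_duplicates())

result = df.groupby(["x", "y"])["w"].apply(join_unique).reset_index()
print(result)
```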
