Reputation: 25366

pandas: aggregate rows for a given column and count the number

I have the following data frame my_df:

team      member
--------------------    
 A         Mary
 B         John
 C         Amy
 A         Dan
 B         Dave
 D         Paul
 B         Alex
 A         Mary
 D         Mary

I want the new output the new data frame new_df as:

team      members              number
--------------------------------------
 A       [Mary,Dan]              2
 B       [John,Dave,Alex]        3
 C       [Amy]                   1
 D       [Paul,Mary]             2

I am wondering is there any existing pandas function can perform the above task? Thanks!

Upvotes: 7

Answers (3)

racket99

Reputation: 655

using lambda:

newdf=pd.DataFrame()
newdf['team']=my_df['team'].unique()
newdf['members']=newdf['team'].map(lambda x:list(my_df[my_df['team']==x]['member']))
newdf['number']=newdf.members.map(lambda x: len(x))
newdf.set_index('team',inplace=True)

Upvotes: 1

akuiper

Reputation: 214957

Another option here:

(df.groupby("team", as_index=False).member
   .agg({"member": lambda x: list(x), "count": "count"}))

Upvotes: 3

piRSquared

Reputation: 294278

using groupby

pd.concat

g = df.groupby('team').member
pd.concat([g.apply(list), g.count()], axis=1, keys=['members', 'number'])

agg

g = df.groupby('team').member
g.agg(dict(members=lambda x: list(x), number='count'))

                 members  number
team                            
A            [Mary, Dan]       2
B     [John, Dave, Alex]       3
C                  [Amy]       1
D                 [Paul]       1

Upvotes: 8

pandas: aggregate rows for a given column and count the number

Answers (3)

Related Questions