agf1997

Reputation: 2898

apply(list) to multiple columns in pandas

I currently have a DataFrame that looks like this:

import pandas as pd

df = pd.DataFrame({'A': [1, 1, 2, 2, 2, 2, 3],
                   'B': ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
                   'C': ['1', '2', '3', '4', '5', '6', '7']})

Running df2 = df.groupby('A')['B'].apply(list).reset_index() produces this:

   A             B
0  1        [a, b]
1  2  [c, d, e, f]
2  3           [g]

How can I produce this, with column C collected into lists as well?

   A             B             C
0  1        [a, b]        [1, 2]
1  2  [c, d, e, f]  [3, 4, 5, 6]
2  3           [g]           [7]

Upvotes: 2

Views: 278

Answers (2)

Jon Clements

Reputation: 142106

Use:

df.groupby('A', as_index=False).agg(list)

This gives you:

   A             B             C
0  1        [a, b]        [1, 2]
1  2  [c, d, e, f]  [3, 4, 5, 6]
2  3           [g]           [7]
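
Note that agg(list) is applied to every non-grouping column. As a minimal sketch, assuming the frame also had a column you did not want collected (the column D below is hypothetical and not part of the question), you could select the columns before aggregating:

```python
import pandas as pd

# Data from the question, plus a hypothetical extra column D for illustration.
df = pd.DataFrame({'A': [1, 1, 2, 2, 2, 2, 3],
                   'B': ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
                   'C': ['1', '2', '3', '4', '5', '6', '7'],
                   'D': range(7)})

# Select only B and C before aggregating, so D is left out of the result.
out = df.groupby('A')[['B', 'C']].agg(list).reset_index()
print(out)
```

Here reset_index() turns the group key A back into an ordinary column, as in the question's own snippet.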

Upvotes: 3

creanion

Reputation: 2733

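You can do it with named aggregation, which lets you pick the source column and the function for each output column:

df.groupby('A', as_index=False).agg(B=("B", list), C=("C", list))

This gives:

   A             B             C
0  1        [a, b]        [1, 2]
1  2  [c, d, e, f]  [3, 4, 5, 6]
2  3           [g]           [7]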

Or, equivalently, with pivot_table:

pd.pivot_table(data=df, index="A", values=["B", "C"], aggfunc=list).reset_index()
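
The named aggregation form (the first snippet in this answer) is mainly useful when the columns need different treatment. A rough sketch, where the output names B_list and n_rows are made up for illustration and are not from the question:

```python
import pandas as pd

# Data from the question.
df = pd.DataFrame({'A': [1, 1, 2, 2, 2, 2, 3],
                   'B': ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
                   'C': ['1', '2', '3', '4', '5', '6', '7']})

# Each keyword is an output column name (hypothetical names here);
# each tuple is (source column, aggregation function).
out = df.groupby('A', as_index=False).agg(
    B_list=('B', list),     # collect B values into a list per group
    n_rows=('C', 'count'),  # count the non-null C values per group
)
print(out)
```

If every column just needs list, the plain agg(list) form is shorter and does the same job.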

Upvotes: 3
