Reputation: 768
I have a dataset like this:
>>> import pandas as pd
>>> df = pd.DataFrame({'id_sin': ['s123', 's123', 's124', 's124'],
...                    'raison': ['first problem', 'second problem', 'album', 'dog']})
>>> df
  id_sin          raison
0   s123   first problem
1   s123  second problem
2   s124           album
3   s124             dog
This is the expected output:
  id_sin                         raison
0   s123  first problem, second problem
1   s124                     album, dog
What I tried:
df['raison'] = df.groupby('id_sin')['raison'].apply(lambda x: ', '.join(x))
But this doesn't work. What am I missing? Thanks for your help!
Upvotes: 1
Views: 1445
Reputation: 153500
Try using agg:
df.groupby('id_sin')['raison'].agg(', '.join).reset_index()
Output:
  id_sin                         raison
0   s123  first problem, second problem
1   s124                     album, dog
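For completeness, here is a self-contained sketch of the same approach that can be pasted into a script. It rebuilds the question's sample frame; the variable name result is only an illustration and does not come from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'id_sin': ['s123', 's123', 's124', 's124'],
    'raison': ['first problem', 'second problem', 'album', 'dog'],
})

# Join the strings of each group, then turn the group keys back into a column.
# `result` is an illustrative name, not something from the original post.
result = df.groupby('id_sin')['raison'].agg(', '.join).reset_index()

print(result)
```

The result is a new two-row frame, so it is kept in its own variable rather than written back into df.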
Upvotes: 3
Reputation: 150785
Try changing the groups to lists:
df.groupby(['id_sin']).raison.apply(lambda x: ', '.join(list(x)))
After testing your code, it turns out that you should not do df['raison'] = ..., because df.groupby('id_sin')['raison'].apply(lambda x: ', '.join(x)) has length 2 and a different index than df, which has length 4. Assign the result to a new variable instead.
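The mismatch can be seen directly with a short sketch. The names joined, broken and per_row are illustrative only. The last step uses transform, which is not part of the original answer; it is shown as one possible option, assuming you want to keep all four rows and repeat the joined string on each of them.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'id_sin': ['s123', 's123', 's124', 's124'],
    'raison': ['first problem', 'second problem', 'album', 'dog'],
})

# One joined string per group; the index holds the group keys, not 0..3.
joined = df.groupby('id_sin')['raison'].apply(lambda x: ', '.join(x))
print(joined.index.tolist())   # ['s123', 's124']
print(df.index.tolist())       # [0, 1, 2, 3]

# Column assignment aligns on index labels. None of the labels match,
# so every row ends up as NaN.
broken = df.copy()
broken['raison'] = joined
print(broken['raison'].isna().all())   # True

# Assumption: if the original four rows should be kept, transform
# broadcasts each group's joined string back onto the original index.
per_row = df.groupby('id_sin')['raison'].transform(lambda x: ', '.join(x))
print(per_row.tolist())
```

If the goal is the two-row frame from the question, assign the grouped result to a new variable and call reset_index() on it rather than writing it into df.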
Upvotes: 1