Reputation: 75
Here is my data frame:
Name| Filename| delimetier| good delimeter| bad delimeter
A| 123| 48| a| A
A| 123| 48| | A
B| 123| 48| b| C
C| 123| 49| c| B
A| 123| 48| d| D
A| 123| 48| c| E
B| 123| 48| d| F
What I want is:
Name| Filename| delimetier| good delimeter| bad delimeter
A| 123| 48| a, c, d| A, D, E
B| 123| 48| b, d| C, F
C| 123| 49| c| B
Null values and duplicates may be present and should be ignored. I tried to solve this with groupby(), but failed.
Upvotes: 1
Views: 38
Reputation: 93191
groupby is the right approach. You only need to define a custom aggregate function:
str_concat = lambda s: ", ".join(s.drop_duplicates().dropna().sort_values())
df.groupby(["Name", "Filename", "delimetier"]).agg(str_concat)
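For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question, assuming the missing "good delimeter" cell in the second row is NaN (if your nulls are empty strings instead, they would need to be converted first). The reset_index() call and the name result are my additions, only there to turn the group keys back into ordinary columns.

```python
import numpy as np
import pandas as pd

# Sample data from the question. Assumption: the blank "good delimeter"
# cell in the second row is a real NaN, not an empty string.
df = pd.DataFrame(
    {
        "Name": ["A", "A", "B", "C", "A", "A", "B"],
        "Filename": [123] * 7,
        "delimetier": [48, 48, 48, 49, 48, 48, 48],
        "good delimeter": ["a", np.nan, "b", "c", "d", "c", "d"],
        "bad delimeter": ["A", "A", "C", "B", "D", "E", "F"],
    }
)


def str_concat(s):
    """Join the unique, non-null values of one group, in sorted order."""
    return ", ".join(s.dropna().drop_duplicates().sort_values())


# Group on the three key columns; str_concat is applied to each
# remaining column within each group.
result = (
    df.groupby(["Name", "Filename", "delimetier"])
    .agg(str_concat)
    .reset_index()
)
print(result)
```

Without reset_index(), the three key columns end up in a MultiIndex, which is fine for lookups but usually less convenient if you want to write the result back out as a flat table.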
Upvotes: 1