crocefisso
crocefisso

Reputation: 903

How to remove duplicate strings after grouping?

I would like to group the following Pandas DataFrame by ID column:


|----+----------------------------------------+-----------------|
| ID | Elements                               | Colors          |
|----+----------------------------------------+-----------------|
| A  | '1st element, 2d element, 3d element'  | 'red, blue'     |
| A  | '2d element, 4th element'              | 'blue, green'   |
| B  | '3d element, 5th element, 6th element' | 'white, purple' |
| B  | '3d element, 5th element, 7th element' | 'white, teal'   |
| B  | '3d element, 5th element, 8th element' | 'white, black'  |
|----+----------------------------------------+-----------------|

In order to obtain the following Pandas DataFrame:

|----+-----------------------------------------------------------------+------------------------------|
| ID | Elements                                                        | Colors                       |
|----+-----------------------------------------------------------------+------------------------------|
| A  | '1st element, 2d element, 3d element, 4th element'              | 'red, blue, green'           |
| B  | '3d element, 5h element, 6th element, 7th element, 8th element' | 'white, purple, teal, black' |
|----+-----------------------------------------------------------------+------------------------------|

Upvotes: 0

Views: 57

Answers (1)

hlidka
hlidka

Reputation: 2345

import pandas as pd
from itertools import chain

data = {'ID': ['A', 'A', 'B', 'B', 'B'],
'Elements': ['1st element, 2d element, 3d element', '2d element, 4th element', '3d element, 5th element, 6th element', '3d element, 5th element, 7th element', '3d element, 5th element, 8th element'],
'Colors': ['red, blue', 'blue, green', 'white, purple', 'white, teal', 'white, black']}

df = pd.DataFrame(data)

def comma_split(s):
  return s.split(", ")

def uniques(lists):
  return ", ".join(set(chain(*map(comma_split, lists))))

df.groupby('ID').aggregate(uniques)

                                             Elements                      Colors
ID                                                                               
A    1st element, 4th element, 3d element, 2d element            green, red, blue
B   5th element, 8th element, 3d element, 7th elem...  purple, white, black, teal

Upvotes: 1

Related Questions