Steven Wink

Reputation: 137

pandas dataframe with comma separated string entries, change to unique comma separated entries

I have a pandas dataframe as such:

import pandas as pd
data = [["a,a,a", "b,b", "c,c,c"], ["d,d","e","fd"],["g,h,i", "g", "fg,h,a"]]
df = pd.DataFrame(data, columns = ["ColA","ColB","ColC"])

df

    ColA    ColB    ColC
0   a,a,a   b,b     c,c,c
1   d,d     e       fd
2   g,h,i   g       fg,h,a

I would like to reformat this table to:

    ColA    ColB    ColC
0   a       b       c
1   d       e       fd
2   g,h,i   g       fg,h,a

In other words, each cell should keep only the unique entries that result from splitting its string on commas.

Upvotes: 1

Views: 743

Answers (1)

BStadlbauer

Reputation: 1285

df.applymap(lambda elements: ','.join(set(elements.split(','))))

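applymap() applies a function to every element (cell) of a DataFrame. The lambda here first splits each string on ",", then builds a set from the resulting pieces to drop duplicates, and finally joins them back together with the string's .join() method. Note that a set is unordered, so the order of the entries within a cell is not guaranteed to match the input.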
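If the original order within each cell matters (for example, keeping "g,h,i" rather than some shuffled variant), one possible variation is to deduplicate with dict.fromkeys(), which keeps the first occurrence of each key in insertion order. The sketch below is not part of the original answer; the helper name unique_csv is hypothetical, and the fallback between DataFrame.map and DataFrame.applymap assumes that newer pandas releases (2.1 and later) prefer map while older ones only offer applymap.

```python
import pandas as pd


def unique_csv(cell: str) -> str:
    """Drop repeated comma-separated entries, keeping first-seen order.

    Hypothetical helper for illustration; assumes every cell is a string.
    """
    # dict.fromkeys keeps one key per distinct piece, in insertion order
    return ",".join(dict.fromkeys(cell.split(",")))


data = [["a,a,a", "b,b", "c,c,c"], ["d,d", "e", "fd"], ["g,h,i", "g", "fg,h,a"]]
df = pd.DataFrame(data, columns=["ColA", "ColB", "ColC"])

# Use DataFrame.map where it exists, otherwise fall back to applymap
elementwise = df.map if hasattr(df, "map") else df.applymap
result = elementwise(unique_csv)

print(result)
```

With the sample data from the question, the last row stays "g,h,i", "g", "fg,h,a", since each entry there is already unique and the order is preserved.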

Upvotes: 2
