Reputation: 19
How can I get the unique values of a pandas column that contains either lists or single values? My column:
column | column
test | [A,B]
test | [A,C]
test | C
test | D
test | [E,B]
I want a list like this:
list = [A, B, C, D, E]
Thank you.
Upvotes: 1
Views: 89
Reputation: 294278
You can use a flattening function (credit: @wim):
from collections.abc import Iterable

def flatten(l):
    # Recursively yield leaf values; strings are treated as single values, not iterables.
    for i in l:
        if isinstance(i, Iterable) and not isinstance(i, str):
            yield from flatten(i)
        else:
            yield i
Then use set to remove the duplicates:
list(set(flatten(df.B)))
['A', 'B', 'E', 'C', 'D']
Setup used for df:
import pandas as pd
df = pd.DataFrame(dict(
    B=[['A', 'B'], ['A', 'C'], 'C', 'D', ['E', 'B']]
))
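Note that set does not preserve order, which is why the output above is not alphabetical and may differ between runs. If the order matters, here is a minimal sketch of two options: dict.fromkeys to keep the order of first appearance, or sorted for an alphabetical result. The plain list named values is only a stand-in for the column (df.B above), and the names unique_in_order and unique_sorted are made up for illustration.

```python
from collections.abc import Iterable

def flatten(items):
    """Yield leaf values from nested iterables, treating strings as leaves."""
    for item in items:
        if isinstance(item, Iterable) and not isinstance(item, str):
            yield from flatten(item)
        else:
            yield item

# Hypothetical stand-in for the column values (df.B in the answer above).
values = [['A', 'B'], ['A', 'C'], 'C', 'D', ['E', 'B']]

# dict keys keep insertion order (Python 3.7+), so duplicates are dropped
# while the order of first appearance is preserved.
unique_in_order = list(dict.fromkeys(flatten(values)))

# Alternatively, sort the set for a deterministic alphabetical result.
unique_sorted = sorted(set(flatten(values)))

print(unique_in_order)  # ['A', 'B', 'C', 'D', 'E']
print(unique_sorted)    # ['A', 'B', 'C', 'D', 'E']
```

The same two lines should work with flatten(df.B) in place of flatten(values), since flatten only needs something it can iterate over.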
Upvotes: 1
Reputation: 59549
You can apply pd.Series to split up the lists, then stack and unique.
import pandas as pd
df = pd.DataFrame({'col': [['A', 'B'], ['A', 'C'], 'C', 'D', ['E', 'B']]})
df.col.apply(pd.Series).stack().unique().tolist()
Output:
['A', 'B', 'C', 'D', 'E']
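As an alternative to apply(pd.Series), and not what this answer uses, Series.explode may be simpler if you are on pandas 0.25 or later (that version requirement is an assumption about your environment). It gives each list element its own row and leaves non-list entries unchanged. A minimal sketch with the same example data; the name result is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({'col': [['A', 'B'], ['A', 'C'], 'C', 'D', ['E', 'B']]})

# explode() puts each list element on its own row; scalar entries stay as they are.
# unique() then keeps the first occurrence of each value, in order of appearance.
result = df['col'].explode().unique().tolist()

print(result)  # ['A', 'B', 'C', 'D', 'E']
```

This avoids building the intermediate wide DataFrame that apply(pd.Series) creates, which may matter on large columns.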
Upvotes: 1