Reputation: 23
How can I get the set of unique values from a DataFrame's elements?
For example, if I have data such as
df = DataFrame([['a','b','z'], ['a', 'c'], ['d']])
I would like to get
{'a', 'b', 'c', 'd', 'z'}
(type: set)
I can do this with a for loop, but if pandas offers a built-in way to compute it, I would prefer to use that.
Upvotes: 2
Views: 178
Reputation: 323236
Try this:
A = []
for row in df.values.tolist():
    A.extend(row)  # flatten the rows into one list
# drop None, deduplicate, and sort so the output order is stable
A = sorted(set(i for i in A if i is not None))
A
Out[1224]: ['a', 'b', 'c', 'd', 'z']
Upvotes: 0
Reputation: 109546
s = set(df.values.ravel())
>>> s
{None, 'a', 'b', 'c', 'd', 'z'}
Technically, None is one of the DataFrame's values, so it belongs in the result. If you don't want it, remove it with s.remove(None), or with s.discard(None), which does not raise a KeyError when None is absent.
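As a rough sketch of the same idea with the missing cells filtered out in one step: the snippet below flattens the values and keeps only entries that pd.notna accepts, so it should not matter whether the padding shows up as None or NaN. The names flat and s are just for illustration.

```python
import pandas as pd

df = pd.DataFrame([['a', 'b', 'z'], ['a', 'c'], ['d']])

# Flatten the 2-D array of cell values into a 1-D array.
flat = df.to_numpy().ravel()

# Keep only non-missing cells; pd.notna is False for both None and NaN.
s = {v for v in flat if pd.notna(v)}
print(s)
```

This avoids having to call s.remove(None) afterwards, at the cost of a Python-level loop over every cell.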
Upvotes: 2
Reputation: 210842
DataFrame.stack() uses dropna=True by default, so missing values are dropped:
In [56]: df.stack().tolist()
Out[56]: ['a', 'b', 'z', 'a', 'c', 'd']
or as a set:
In [57]: set(df.stack().tolist())
Out[57]: {'a', 'b', 'c', 'd', 'z'}
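A self-contained variant of the stack approach, as a sketch: it calls dropna() explicitly rather than relying on the default behaviour of stack(), on the assumption that the default may differ between pandas versions. The name unique_values is just for illustration.

```python
import pandas as pd

df = pd.DataFrame([['a', 'b', 'z'], ['a', 'c'], ['d']])

# stack() turns the frame into one long Series of cell values;
# the explicit dropna() removes missing cells if stack() kept them.
stacked = df.stack().dropna()

# unique() deduplicates inside pandas before the set is built.
unique_values = set(stacked.unique())
print(unique_values)
```

Deduplicating with unique() first keeps the Python set construction small when the DataFrame has many repeated values.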
Upvotes: 2