Reputation: 85
I want to get the distinct values from each tuple in a list of tuples. Say we have the following tuples, generated by:
df_clustering_dupliacteRemove.groupby('cluster').agg(tuple).sum(1).map(natsorted).map(tuple)
Output
604 (GM0051, GM0178, GM0191)
605 (GM0134, GM0267, GM0351, GM0615)
606 (GM0180, GM0474, GM0512)
607 (GM0216, GM0471, GM0586)
608 (GM0373, GM0373, GM0373)
Look at the tuple for cluster 608: all of its elements are the same. I want to collapse those duplicates into a single element (i.e. keep only the unique/distinct values).
Desired output
604 (GM0051, GM0178, GM0191)
605 (GM0134, GM0267, GM0351, GM0615)
606 (GM0180, GM0474, GM0512)
607 (GM0216, GM0471, GM0586)
608 (GM0373)
Upvotes: 2
Views: 872
Reputation: 1280
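You can convert each tuple to a set to get its unique values, then convert it back to a tuple. Note that a set does not preserve the original element order, which is why some tuples in the output are reordered.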
In [1]: data = [('GM0051', 'GM0178', 'GM0191'),('GM0134', 'GM0267', 'GM0351', 'GM0615'),('GM0180', 'GM0474', 'GM0512'),('GM0216', 'GM0471', 'GM0586'),('GM0373', 'GM0373', 'GM0373')]
In [2]: [tuple(set(d)) for d in data]
Out[2]:
[('GM0051', 'GM0191', 'GM0178'),
('GM0267', 'GM0134', 'GM0351', 'GM0615'),
('GM0474', 'GM0512', 'GM0180'),
('GM0216', 'GM0471', 'GM0586'),
('GM0373',)]
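If the original order of the elements matters (the tuples in the question were already sorted with natsorted), one order-preserving alternative is `dict.fromkeys`, since dicts keep insertion order in Python 3.7+. This is only a sketch; the helper name `unique_in_order` is made up for illustration.

```python
# Sample tuples taken from the question.
data = [
    ("GM0051", "GM0178", "GM0191"),
    ("GM0373", "GM0373", "GM0373"),
]


def unique_in_order(t):
    """Drop duplicates from a tuple, keeping the first occurrence of each item."""
    # dict keys are unique and remember insertion order (Python 3.7+).
    return tuple(dict.fromkeys(t))


result = [unique_in_order(t) for t in data]
print(result)
```

Unlike `tuple(set(d))`, this keeps `GM0051, GM0178, GM0191` in the same order they had in the input.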
Upvotes: 0
Reputation: 331
This should help. I used a set to remove the duplicates from a tuple :)
def simplify_tuple(t):
    return tuple({i for i in t})
tuple_ = (1, 1, 1, 2, 3)
print(simplify_tuple(tuple_))
# (1, 2, 3)
Upvotes: 0
Reputation: 113
This worked for me.
> myDf
COL
0 (a, b, c)
1 (d, e, f)
2 (a, a)
3 a
4 (c, c, c)
> result = pd.DataFrame(myDf.apply( lambda x : tuple(set(x[0])), axis=1))
> result
0
0 (a, b, c)
1 (f, e, d)
2 (a,)
3 (a,)
4 (c,)
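For a Series of tuples like the one in the question, the same idea can be applied with `Series.map`. The sketch below assumes the IDs are fixed-width strings, so plain `sorted` gives the same order as natsorted would; the sample Series and the names `s` and `deduped` are made up for illustration.

```python
import pandas as pd

# A small Series of tuples indexed by cluster, mimicking the question's data.
s = pd.Series(
    [("GM0216", "GM0471", "GM0586"), ("GM0373", "GM0373", "GM0373")],
    index=[607, 608],
)

# set() removes duplicates; sorted() restores a deterministic order.
deduped = s.map(lambda t: tuple(sorted(set(t))))
print(deduped)
```

If the IDs were not fixed-width, `natsorted(set(t))` could presumably replace `sorted(set(t))` to match the original pipeline.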
Upvotes: 1