Reputation: 2257
Up until now I have been using this code to uniquify (remove duplicates from) a list in Python:
my_list = list(set(my_list))
I now have a list of lists, and I want to remove duplicate sublists from it. For example:
(['possible-duplicate', 'random-data'], ['possible-duplicate', 'random-data'], ['possible-duplicate', 'random-data'])
I want to remove the whole sublist if its first element (possible-duplicate) is a duplicate.
Can this be done?
Thanks
Upvotes: 1
Views: 159
Reputation: 82949
Make a dictionary from your data:
data = (['possible-duplicate', '12345'],
['not-a-duplicate', '54321'],
['possible-duplicate', '51423'])
data_unique = dict(data)
The result is {'not-a-duplicate': '54321', 'possible-duplicate': '51423'}, or if you prefer a list of tuples, use data_unique.items(), which gives you [('not-a-duplicate', '54321'), ('possible-duplicate', '51423')].
Or, for the more general case where the sublists have more than two elements, you can use this:
data_unique = dict((d[0], d) for d in data)
and then use data_unique.values() to reclaim the "uniquified" list.
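One thing worth spelling out about the dictionary approach: when two sublists share a first element, the later one overwrites the earlier one, so the last occurrence wins. The sketch below shows that behaviour next to a first-occurrence-wins variant built with setdefault. It assumes Python 3.7 or newer (where dicts keep insertion order and values() returns a view that needs list(...)); the names by_key, last_wins, first_wins_map and first_wins are just illustrative.

```python
data = (
    ['possible-duplicate', '12345'],
    ['not-a-duplicate', '54321'],
    ['possible-duplicate', '51423'],
)

# Key each sublist by its first element. A later sublist with the same
# key replaces the earlier value, so the LAST occurrence is kept.
by_key = {d[0]: d for d in data}
last_wins = list(by_key.values())

# setdefault stores a value only the first time a key is seen,
# so the FIRST occurrence is kept instead.
first_wins_map = {}
for d in data:
    first_wins_map.setdefault(d[0], d)
first_wins = list(first_wins_map.values())

print(last_wins)
print(first_wins)
```

Which variant is appropriate depends on whether the earlier or the later sublist should survive; the original answer's dict(...) call corresponds to the last-wins case.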
Upvotes: 2
Reputation: 25974
seen = set()
[sublist for sublist in my_list if sublist[0] not in seen and not seen.add(sublist[0])]
This happens to preserve order as well, which list(set(...)) does not.
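The one-liner above relies on set.add returning None, so "not seen.add(...)" is always true and is there only for its side effect. For readers who find that hard to follow, here is a sketch of the same idea as an explicit loop. The function name unique_by_first and the sample data are made up for illustration; the logic is meant to match the comprehension, keeping the first sublist seen for each first element.

```python
def unique_by_first(rows):
    """Keep the first sublist seen for each distinct first element, in order."""
    seen = set()
    result = []
    for row in rows:
        key = row[0]
        if key not in seen:
            seen.add(key)       # remember this key so later rows are skipped
            result.append(row)
    return result


my_list = [
    ['possible-duplicate', '12345'],
    ['not-a-duplicate', '54321'],
    ['possible-duplicate', '51423'],
]
print(unique_by_first(my_list))
```

Because only the first element goes into the set, the sublists themselves do not have to be hashable, which is why this works where list(set(my_list)) would raise a TypeError on a list of lists.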
Upvotes: 4