Reputation: 5780
This is sort of a ridiculous and weird use case, but bear with me. I have this list comprehension:
"reading_types": [
    {
        "name": rt.reading_type,
        "value": rt.reading_type_id,
    }
    for unit in item.units
    for rt in unit.reading_types
],
in a backend api call. It works great except that there will almost always be duplicates in the end result. How can I ensure that no duplicates are returned?
This is actually happening inside another list comprehension, and I can't reference the list at any point to remove duplicates so I must do so within the list comprehension itself.
I've tried using a set:
set([
    {
        "name": rt.reading_type,
        "value": rt.reading_type_id,
    }
    for unit in item.units
    for rt in unit.reading_types
])
but this results in the error: TypeError: unhashable type: 'dict'
Upvotes: 3
Views: 134
Reputation: 140307
The idea is to make your structures hashable without destroying them too much, so you can restore them afterwards to how they were.
You could convert your dictionaries to dict_items, then to tuples (now the data is hashable, so it can go in a set), apply a set on that, and convert back to dictionaries:
input_list = [{"name":"name1","id":"id1"},{"name":"name2","id":"id2"},
{"name":"name1","id":"id1"}]
output_list = [dict(items) for items in {tuple(a.items()) for a in input_list}]
This works because values of the sub-dicts are hashable (strings). If they were dictionaries, we'd have to convert them too.
result:
[{'id': 'id1', 'name': 'name1'}, {'id': 'id2', 'name': 'name2'}]
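For the nested case mentioned above, one option is to apply the same items-to-tuple conversion recursively. A minimal sketch (make_hashable is a hypothetical helper name, not part of the answer above):

```python
def make_hashable(value):
    # Recursively convert dicts (and lists) into hashable tuples;
    # anything else is assumed to be hashable already.
    if isinstance(value, dict):
        return tuple(sorted((k, make_hashable(v)) for k, v in value.items()))
    if isinstance(value, list):
        return tuple(make_hashable(v) for v in value)
    return value

nested = [{"name": "n1", "meta": {"unit": "u1"}},
          {"name": "n1", "meta": {"unit": "u1"}}]

# The two equal nested dicts collapse to a single hashable key.
unique_keys = {make_hashable(d) for d in nested}
print(len(unique_keys))  # 1
```

Restoring the original nested dicts from these tuples takes a matching inverse conversion, so this is best kept for deduplication keys rather than round-tripping.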
Another solution (by Jon Clements) that doesn't use a set but builds a dictionary (using a dictionary comprehension), relying on key uniqueness to clobber duplicates, then extracts only the values:
list({tuple(d.items()):d for d in input_list}.values())
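Applied to the input_list above, the clobbering behaviour looks like this:

```python
input_list = [{"name": "name1", "id": "id1"},
              {"name": "name2", "id": "id2"},
              {"name": "name1", "id": "id1"}]

# Duplicate dicts produce the same tuple key, so later copies
# overwrite earlier ones; only the values are kept at the end.
deduped = list({tuple(d.items()): d for d in input_list}.values())
print(deduped)  # [{'name': 'name1', 'id': 'id1'}, {'name': 'name2', 'id': 'id2'}]
```

Since dictionaries preserve insertion order (Python 3.7+), this also keeps the first-seen order of the input.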
Upvotes: 6
Reputation: 61063
You can use a namedtuple instead of a dictionary inside the set. As immutable objects, namedtuples are hashable, which dictionaries are not. You can also use a set comprehension directly:
from collections import namedtuple
reading_type = namedtuple("reading_type", ["name", "value"])
{reading_type(rt.reading_type, rt.reading_type_id)
for unit in item.units
for rt in unit.reading_types}
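The snippet above depends on the asker's item/unit objects; a self-contained sketch with stand-in data (the rows list is invented for illustration) shows the deduplication at work:

```python
from collections import namedtuple

ReadingType = namedtuple("ReadingType", ["name", "value"])

# Stand-in for the (rt.reading_type, rt.reading_type_id) pairs.
rows = [("temp", 1), ("humidity", 2), ("temp", 1)]

# namedtuples hash and compare by field values, so equal rows collapse.
unique = {ReadingType(name, value) for name, value in rows}
print(len(unique))  # 2
```

If the API response must contain plain dicts, each namedtuple can be converted back with `rt._asdict()`.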
Upvotes: 2
Reputation: 164843
This isn't a list comprehension, but you can use the itertools unique_everseen recipe, also available in third-party libraries, e.g. more_itertools.unique_everseen:
from more_itertools import unique_everseen
input_list = [{"name":"name1","id":"id1"},{"name":"name2","id":"id2"},
{"name":"name1","id":"id1"}]
res = list(unique_everseen(input_list, key=lambda d: tuple(sorted(d.items()))))
print(res)
[{'name': 'name1', 'id': 'id1'}, {'name': 'name2', 'id': 'id2'}]
The trick is to make sure you can hash your dictionaries, which we do here by converting each dictionary to a tuple of sorted key-value tuples. Internally, the algorithm maintains a "seen" set of keys and yields only values whose key is not already in the set, adding each new key as it is encountered.
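The recipe is short enough to inline if you'd rather not add a dependency; a simplified version of that seen-set logic (the full itertools recipe also handles unhashable keys) looks like:

```python
def unique_everseen(iterable, key=None):
    # Yield items in order, skipping any whose key has been seen before.
    seen = set()
    for item in iterable:
        k = item if key is None else key(item)
        if k not in seen:
            seen.add(k)
            yield item

input_list = [{"name": "name1", "id": "id1"},
              {"name": "name2", "id": "id2"},
              {"name": "name1", "id": "id1"}]
res = list(unique_everseen(input_list, key=lambda d: tuple(sorted(d.items()))))
print(res)  # [{'name': 'name1', 'id': 'id1'}, {'name': 'name2', 'id': 'id2'}]
```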
Upvotes: 0
Reputation: 2843
You can wrap your entire list in another comprehension that reprs each entry, and apply set on that:
set([repr(val) for val in [...]])
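Note that this yields a set of strings, not dicts; to get dictionaries back you would have to parse each string again, for example with ast.literal_eval. A sketch of the round trip (and a caveat: repr depends on key insertion order, so dicts with equal content but different key order will not deduplicate):

```python
import ast

input_list = [{"name": "name1", "id": "id1"},
              {"name": "name1", "id": "id1"}]

# Deduplicate on the string form, then parse each string back to a dict.
unique_reprs = set(repr(d) for d in input_list)
restored = [ast.literal_eval(s) for s in unique_reprs]
print(restored)  # [{'name': 'name1', 'id': 'id1'}]
```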
Upvotes: -1