Reputation: 683
I have a nested dictionary d1
d1={'Hiraki': {'Hiraki_2': ['KANG_785','KANG_785','KANG_762']}, 'LakeTaupo': {'LakeTaupo_2': ['KANG_785', 'KANG_785', 'KANG_785', 'KANG_751']}}
I would like to remove the duplicate values for each key. The result after removing the duplicate values should be:
d1={'Hiraki': {'Hiraki_2': ['KANG_785','KANG_762']}, 'LakeTaupo': {'LakeTaupo_2': ['KANG_785', 'KANG_751']}}
I do not know how to code this in Python. Please help me.
Upvotes: 0
Views: 99
Reputation: 768
This will modify the dictionary in place, replacing each list with a set:
d1={'Hiraki': {'Hiraki_2': ['KANG_785','KANG_785','KANG_762']}, 'LakeTaupo': {'LakeTaupo_2': ['KANG_785', 'KANG_785', 'KANG_785', 'KANG_751']}}
# Deals with tuples: they are immutable, so build a new tuple
def recurse_tuple(my_tup):
    return tuple(recurse_dict(v) if isinstance(v, dict) else v for v in my_tup)

# Deals with the dictionaries and lists
def recurse_dict(my_dict):
    for k, v in my_dict.items():
        if isinstance(v, dict):
            my_dict[k] = recurse_dict(v)
        elif isinstance(v, tuple):
            my_dict[k] = recurse_tuple(v)
        elif isinstance(v, list):
            my_dict[k] = set(v)  # duplicates removed; the list becomes a set
    return my_dict

print(recurse_dict(d1))
#Output
{'Hiraki': {'Hiraki_2': {'KANG_762', 'KANG_785'}}, 'LakeTaupo': {'LakeTaupo_2': {'KANG_785', 'KANG_751'}}}
NOTE: @Samwise has beaten me to the punch with a very neat recursive function.
Upvotes: 1
Reputation: 953
You can use set() to eliminate duplicates.
d1={'Hiraki': {'Hiraki_2': ['KANG_785','KANG_785','KANG_762']}, 'LakeTaupo': {'LakeTaupo_2': ['KANG_785', 'KANG_785', 'KANG_785', 'KANG_751']}}
d2 = {key1: {key2: list(set(val2)) for key2, val2 in val1.items()}
      for key1, val1 in d1.items()}
print(d2)
Output (the order of elements within each list may vary, because sets are unordered):
{'Hiraki': {'Hiraki_2': ['KANG_785', 'KANG_762']}, 'LakeTaupo': {'LakeTaupo_2': ['KANG_785', 'KANG_751']}}
Upvotes: 1
Reputation: 6780
Basically, if you want to remove duplicate values in a sequence, you convert it to a set, then back again.
>>> data = ['KANG_785','KANG_785','KANG_762']
>>> data = list(set(data))
>>> data
['KANG_762', 'KANG_785']
Notice that this will not maintain ordering.
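If the original order matters (the expected result in the question keeps first-occurrence order), one common alternative is dict.fromkeys. This is only a sketch: it assumes Python 3.7+, where dicts keep insertion order, and hashable items. The name unique_in_order is made up for the example.

```python
# Order-preserving de-duplication: dict keys are unique and,
# since Python 3.7, keep the order in which they were inserted.
data = ['KANG_785', 'KANG_785', 'KANG_762']
unique_in_order = list(dict.fromkeys(data))
print(unique_in_order)  # ['KANG_785', 'KANG_762']
```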
Also, consider carefully whether you actually need a list or not; a set is still iterable after all, so if you want to maintain uniqueness at all times, consider storing the data as a set and converting it to a list only when necessary.
>>> data = ['KANG_785','KANG_785','KANG_762']
>>> data = set(data)
>>> data
{'KANG_762', 'KANG_785'}
>>> for i in data:
...     print(i)
...
KANG_762
KANG_785
>>> type(data)
<class 'set'>
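To illustrate the point about keeping the data as a set, here is a small sketch. The variable names samples and as_list are hypothetical, and using sorted() for the conversion is just one option when a reproducible order is wanted.

```python
# Keep the values in a set so duplicates can never accumulate.
samples = {'KANG_785', 'KANG_762'}
samples.add('KANG_785')  # already present, so the set is unchanged
samples.add('KANG_751')  # new value, so the set grows

# Convert to a list only when one is needed; sorted() returns a list
# and gives a reproducible order.
as_list = sorted(samples)
print(as_list)  # ['KANG_751', 'KANG_762', 'KANG_785']
```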
Upvotes: 1
Reputation: 7268
You can try:
d1={'Hiraki': {'Hiraki_2': ['KANG_785','KANG_785','KANG_762']}, 'LakeTaupo': {'LakeTaupo_2': ['KANG_785', 'KANG_785', 'KANG_785', 'KANG_751']}}
output = {}
for key, val in d1.items():
    for key1, val1 in val.items():
        # setdefault keeps earlier inner keys instead of overwriting them
        output.setdefault(key, {})[key1] = list(set(val1))
print(output)
Output:
{'Hiraki': {'Hiraki_2': ['KANG_785', 'KANG_762']}, 'LakeTaupo': {'LakeTaupo_2': ['KANG_785', 'KANG_751']}}
Upvotes: 1
Reputation: 71454
You can use the same strategy as described in this answer: "Convert a mixed nested dictionary into a list", but for the case where isinstance(d, list), return list(set(d)) (which will remove duplicate entries) instead of d.
E.g.:
def dedupe_lists(d: dict) -> dict:
    if isinstance(d, list):
        return list(set(d))
    if isinstance(d, dict):
        return {k: dedupe_lists(v) for k, v in d.items()}
    return d
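As a usage sketch, here is the same recursion applied to the dictionary from the question, with one change that is not part of the answer above: list(dict.fromkeys(...)) is swapped in for list(set(...)) so that first-occurrence order is kept. The name dedupe_lists_ordered is hypothetical, and the approach assumes Python 3.7+ and hashable list items.

```python
def dedupe_lists_ordered(d):
    """Recursively de-duplicate every list, keeping first-seen order."""
    if isinstance(d, list):
        # dict.fromkeys keeps only the first occurrence of each item
        return list(dict.fromkeys(d))
    if isinstance(d, dict):
        return {k: dedupe_lists_ordered(v) for k, v in d.items()}
    return d


d1 = {'Hiraki': {'Hiraki_2': ['KANG_785', 'KANG_785', 'KANG_762']},
      'LakeTaupo': {'LakeTaupo_2': ['KANG_785', 'KANG_785', 'KANG_785', 'KANG_751']}}

result = dedupe_lists_ordered(d1)
print(result)
# {'Hiraki': {'Hiraki_2': ['KANG_785', 'KANG_762']}, 'LakeTaupo': {'LakeTaupo_2': ['KANG_785', 'KANG_751']}}
```

Unlike the in-place approach, this builds a new dictionary and leaves d1 untouched.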
Upvotes: 2