Reputation: 129
I have a nested dictionary containing parents (keys) and their children (values). I want to remove parents and their children if the parent is a child of another parent in the tree, i.e. I want to delete a key if it appears elsewhere in the dictionary as a value. Here is example input/output:
Input:
{
"Animal": {
"Cat": [],
"Dog": {
"Labrador": {
"LabradorPup": []
}
}
},
"DieselCar": {
"Hyundai": []
},
"Dog": {
"Labrador": {
"LabradorPup": []
}
},
"ElectricCar": {
"Tesla": []
},
"Labrador": {
"LabradorPup": []
},
"PetrolCar": {
"Ford": [],
"Hyundai": []
},
"Vehicle": {
"DieselCar": {
"Hyundai": []
},
"ElectricCar": {
"Tesla": []
},
"PetrolCar": {
"Ford": [],
"Hyundai": []
}
}
}
Desired output:
{
"Animal": {
"Cat": [],
"Dog": {
"Labrador": {
"LabradorPup": []
}
}
},
"Vehicle": {
"DieselCar": {
"Hyundai": []
},
"ElectricCar": {
"Tesla": []
},
"PetrolCar": {
"Ford": [],
"Hyundai": []
}
}
}
I have the following code, which keeps the parents who have children. However, it doesn't produce the output I am looking for:
inheritance_tree = {parent:children for parent, children in inheritance_tree.items() if any(child for child in children.values())}
You can see that the "Dog" key is not removed even though it is a child of "Animal":
{
"Animal": {
"Cat": [],
"Dog": {
"Labrador": {
"LabradorPup": []
}
}
},
"Dog": {
"Labrador": {
"LabradorPup": []
}
},
"Vehicle": {
"DieselCar": {
"Hyundai": []
},
"ElectricCar": {
"Tesla": []
},
"PetrolCar": {
"Ford": [],
"Hyundai": []
}
}
}
Upvotes: 1
Views: 791
Reputation: 76184
I don't think any(child for child in children.values()) is an effective way of determining whether children should stay in the final dict. That expression is basically equivalent to "does this dict have at least one value that isn't empty?". Every value here is a list or a dict, and only the empty ones are falsy. The Dog dict has a non-empty child (Labrador), so it remains in your final dict.
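To make that concrete, here is a small check of what the any(...) filter evaluates to for two of the top-level entries from the question. The variable names are only for illustration.

```python
dog = {"Labrador": {"LabradorPup": []}}
diesel_car = {"Hyundai": []}

# any() returns True as soon as one value is truthy.
# Empty lists and empty dicts are falsy; non-empty ones are truthy.
dog_kept = any(child for child in dog.values())
diesel_kept = any(child for child in diesel_car.values())

print(dog_kept)     # True: "Labrador" maps to a non-empty dict
print(diesel_kept)  # False: "Hyundai" maps to an empty list
```

So the filter drops DieselCar only because its children have no children of their own, not because DieselCar appears under Vehicle.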
Here's the approach I would use. Write a function that recursively iterates over a nested data structure and yields all of its keys, no matter how deeply they are nested. Run this function on every top-level key-value pair to identify the names of all child values. Then create a new dict that excludes those names from the top level.
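def iter_all_keys(obj):
    # Anything that is not a dict (e.g. an empty list) has no keys to yield.
    if not isinstance(obj, dict):
        return
    for key, value in obj.items():
        yield key
        # Recurse so that keys at every depth are yielded.
        for x in iter_all_keys(value):
            yield x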
d = {
"Animal": {
"Cat": [],
"Dog": {
"Labrador": {
"LabradorPup": []
}
}
},
"DieselCar": {
"Hyundai": []
},
"Dog": {
"Labrador": {
"LabradorPup": []
}
},
"ElectricCar": {
"Tesla": []
},
"Labrador": {
"LabradorPup": []
},
"PetrolCar": {
"Ford": [],
"Hyundai": []
},
"Vehicle": {
"DieselCar": {
"Hyundai": []
},
"ElectricCar": {
"Tesla": []
},
"PetrolCar": {
"Ford": [],
"Hyundai": []
}
}
}
child_names = {child_name for toplevel_name, toplevel_children in d.items() for child_name in iter_all_keys(toplevel_children)}
d = {key: value for key, value in d.items() if key not in child_names}
print(d)
Result (whitespace added by me for clarity):
{
'Animal': {
'Dog': {
'Labrador': {
'LabradorPup': []
}
},
'Cat': []
},
'Vehicle': {
'DieselCar': {
'Hyundai': []
},
'PetrolCar': {
'Hyundai': [],
'Ford': []
},
'ElectricCar': {
'Tesla': []
}
}
}
Note that this only removes duplicates from the top level. If you were to run this code on a dictionary such as this one:
d = {
"Human":{
"Fred": [],
"Barney": []
},
"Caveman":{
"Fred": [],
"Barney": []
}
}
... then the resulting dict would be identical to the input, even though Fred and Barney both appear twice in the data structure. If this is not the desired result, it's not clear what the result should be. Should Fred and Barney be removed from Human, or from Caveman? If the logic should be "keep Fred and Barney in Human, because that's the one we encountered first, and get rid of the rest", then the result will not be deterministic, because dictionaries in Python 2.7 are not guaranteed to be ordered.
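On Python 3.7 and later, plain dicts preserve insertion order, so a "keep the first one encountered" rule becomes well defined. The following is only a sketch of one possible policy (depth-first, first occurrence wins); the function name drop_repeated_names is made up for this example and is not part of the original code.

```python
def drop_repeated_names(tree, seen=None):
    """Keep each name only the first time it is encountered, depth-first.

    Assumes Python 3.7+, where dicts iterate in insertion order.
    """
    if seen is None:
        seen = set()
    if not isinstance(tree, dict):
        return tree  # leaves such as [] are returned unchanged
    result = {}
    for name, children in tree.items():
        if name in seen:
            continue  # this name was already kept earlier, so skip the repeat
        seen.add(name)
        result[name] = drop_repeated_names(children, seen)
    return result


d = {
    "Human": {"Fred": [], "Barney": []},
    "Caveman": {"Fred": [], "Barney": []},
}
deduped = drop_repeated_names(d)
print(deduped)
# {'Human': {'Fred': [], 'Barney': []}, 'Caveman': {}}
```

Note that Caveman ends up as an empty dict rather than being removed. Whether that is the right outcome depends on what the data is supposed to mean.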
Upvotes: 1
Reputation: 2159
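Try this (I know it's complicated). Note that get_all_keys is not defined here; it is assumed to return every key nested inside its argument.
aa = list(a)  # top-level keys, copied so that a can be modified below
bb = [get_all_keys(j) for j in a.values()]  # nested keys under each top-level entry
for i in aa:
    # Delete a top-level key if it also appears nested under any entry.
    if any(i in j for j in bb):
        del a[i]
Let me know whether this gives you the right result.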
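The snippet above calls get_all_keys without defining it. Here is a minimal sketch, assuming the helper is meant to return the set of every key nested inside its argument (that behaviour is a guess, not something the answer states). The loop iterates over a snapshot of the keys, since deleting from a dict while looping over it raises a RuntimeError.

```python
def get_all_keys(obj):
    """Hypothetical helper: return the set of every key nested inside obj."""
    keys = set()
    if isinstance(obj, dict):
        for key, value in obj.items():
            keys.add(key)
            keys |= get_all_keys(value)
    return keys


a = {
    "Animal": {"Dog": {"Labrador": []}},
    "Dog": {"Labrador": []},
    "Labrador": [],
}

# Keys found anywhere below each top-level entry.
nested = [get_all_keys(children) for children in a.values()]

# list(a) takes a snapshot of the keys so that a can be modified in the loop.
for name in list(a):
    if any(name in keys for keys in nested):
        del a[name]

print(a)  # {'Animal': {'Dog': {'Labrador': []}}}
```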
Upvotes: 0
Reputation: 26
inheritance_tree = {
    parent: children
    for parent, children in inheritance_tree.items()
    if any(child for child in children.values())
}
any checks whether at least one of the children has children of its own, so what your current code does is keep only the parents that have grandchildren. If you wish to remove those children from the top level of the dictionary, you can write a function that goes through the dictionary and modifies a copy of it.
If you wish to stick to a one-liner, you need to look for the parent among the keys of the values of the inheritance tree. However, those values are not always dicts (they can be empty lists), so you need to check for that as well.
y = {
    parent: children
    for parent, children in x.items()
    if all(parent not in k for k in x.values() if isinstance(k, dict))
}
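The function-based alternative mentioned above could look roughly like the sketch below. The name remove_nested_parents and the use of an explicit stack are choices made for this example. Unlike the one-liner, which only inspects the direct children of each top-level entry, this version collects names at every depth, so it also removes a top-level key whose only other occurrence is as a grandchild.

```python
import copy


def remove_nested_parents(tree):
    """Return a pruned copy of tree; the input itself is left unchanged."""
    nested_names = set()
    # Start from the children of every top-level entry and walk downwards.
    stack = [children for children in tree.values() if isinstance(children, dict)]
    while stack:
        node = stack.pop()
        for name, children in node.items():
            nested_names.add(name)
            if isinstance(children, dict):
                stack.append(children)

    pruned = copy.deepcopy(tree)
    for name in nested_names:
        pruned.pop(name, None)  # ignore names that are not top-level keys
    return pruned


x = {"Animal": {"Dog": {"Labrador": []}}, "Labrador": []}
pruned = remove_nested_parents(x)
print(pruned)  # {'Animal': {'Dog': {'Labrador': []}}}
```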
Upvotes: 1