Reputation: 101
I have a function, find_schema_differences(a, b), which compares two nested dictionaries and returns the differences.
def find_schema_differences(master_schema, client_schema):
    differences = []
    for x in master_schema:
        if type(master_schema[x]) is not dict:
            if not x in client_schema or master_schema[x] != client_schema[x]:
                differences.append({x: master_schema[x]})
        else:
            if x not in client_schema or not find_schema_differences(
                master_schema[x], client_schema[x]
            ):
                differences.append({x: master_schema[x]})
    return differences
But this resets the 'differences' variable to an empty list on every recursive call.
I tried the following, but it doesn't look good:
def find_schema_differences(master_schema, client_schema):
    differences = []

    def find_difference(master_schema, client_schema):
        for x in master_schema:
            if type(master_schema[x]) is not dict:
                if not x in client_schema or master_schema[x] != client_schema[x]:
                    differences.append({x: master_schema[x]})
            else:
                if x not in client_schema or not find_difference(
                    master_schema[x], client_schema[x]
                ):
                    differences.append({x: master_schema[x]})
        return differences

    find_difference(master_schema, client_schema)
    return differences
So, is there a way I can improve on this?
Upvotes: 0
Views: 93
Reputation: 31354
I think this is what you were after:
def find_schema_differences(d1, d2):
    return dict(
        (k, v) if not isinstance(d1[k], dict) else (k, find_schema_differences(d1[k], d2[k]))
        for k, v in d1.items() if k not in d2 or d1[k] != d2[k]
    )
a = {'x': 1, 'y': 2, 'z': {'a': 1, 'b': 2, 'c': 3}}
b = {'x': 1, 'y': 3, 'z': {'a': 1, 'b': 3}}
differences = find_schema_differences(a, b)
print(differences)
Result:
{'y': 2, 'z': {'b': 2, 'c': 3}}
As for the answer to your question, what you did is not a bad way to create a variable that remains accessible over several calls, although you should generally consider returning the local result and constructing the overall result from the returned values in a recursive solution.
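To illustrate "returning the local result and constructing the overall result from the returned values" while keeping the list-of-dicts output from the question, here is a minimal sketch. It assumes that a nested dictionary should be reported only when the recursive comparison finds something, and that the nested differences are stored under the parent key; both are my assumptions about the intended behaviour, not something stated in the question.

```python
def find_schema_differences(master_schema, client_schema):
    """Return a list of {key: value} entries from master that differ in client."""
    differences = []  # local to this call; nothing is shared between calls
    for key, value in master_schema.items():
        if key not in client_schema:
            # Key is missing entirely: report the whole master value.
            differences.append({key: value})
        elif isinstance(value, dict) and isinstance(client_schema[key], dict):
            # Recurse and build this level's result from the returned list.
            nested = find_schema_differences(value, client_schema[key])
            if nested:  # assumption: report the key only if something differs
                differences.append({key: nested})
        elif value != client_schema[key]:
            differences.append({key: value})
    return differences


a = {'x': 1, 'y': 2, 'z': {'a': 1, 'b': 2, 'c': 3}}
b = {'x': 1, 'y': 3, 'z': {'a': 1, 'b': 3}}
differences = find_schema_differences(a, b)
print(differences)  # [{'y': 2}, {'z': [{'b': 2}, {'c': 3}]}]
```

Because every call owns its list, no outer variable or nested helper function is needed.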
You indicated in comments that you found the compact Python syntax hard to read - I don't blame you, it takes some getting used to. This code does the same, and I assume you might find it a bit easier on the eyes, though it may not perform as well:
def find_schema_differences(d1, d2):
    result = {}
    for k, v in d1.items():
        if k not in d2 or d1[k] != d2[k]:
            if not isinstance(d1[k], dict):
                result[k] = v
            else:
                result[k] = find_schema_differences(d1[k], d2[k])
    return result
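One edge case to be aware of: both versions above evaluate d2[k] whenever d1[k] is a dictionary, so they raise a KeyError if the key is missing from d2, and a TypeError if d2[k] is not a dictionary. A possible variant that reports the whole sub-dictionary in those cases could look like the sketch below; whether that is the right thing to report is an assumption on my part.

```python
def find_schema_differences(d1, d2):
    result = {}
    for k, v in d1.items():
        if k in d2 and v == d2[k]:
            continue  # identical on both sides, nothing to report
        if isinstance(v, dict) and isinstance(d2.get(k), dict):
            # Both sides are dicts: recurse to find the differing parts.
            result[k] = find_schema_differences(v, d2[k])
        else:
            # Missing key, or the other side is not a dict: report the value as is.
            result[k] = v
    return result


a = {'x': 1, 'y': 2, 'z': {'a': 1, 'b': 2, 'c': 3}}
print(find_schema_differences(a, {'x': 1, 'y': 3, 'z': {'a': 1, 'b': 3}}))
# {'y': 2, 'z': {'b': 2, 'c': 3}}
print(find_schema_differences(a, {'x': 1, 'y': 2}))
# {'z': {'a': 1, 'b': 2, 'c': 3}}
```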
Upvotes: 1
Reputation: 635
You could pass the list of differences to the function itself:
def find_schema_differences(master_schema, client_schema, differences=None):
    # Default to None rather than []: a list default would be shared across calls.
    if differences is None:
        differences = []
    for x in master_schema:
        if type(master_schema[x]) is not dict:
            if x not in client_schema or master_schema[x] != client_schema[x]:
                differences.append({x: master_schema[x]})
        else:
            # Pass the same list down so that nested calls append to it as well.
            if x not in client_schema or not find_schema_differences(
                master_schema[x], client_schema[x], differences
            ):
                differences.append({x: master_schema[x]})
    return differences
differences = []
differences = find_schema_differences(master_schema, client_schema, differences=differences)
This way, every difference is appended to the same list. It also avoids making the list a global variable, which would work as well but is not recommended.
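A note on accumulator parameters like this one: if a list literal is used as the default value, Python creates that list once, when the function is defined, and reuses it for every call that omits the argument. The sketch below shows the effect with two hypothetical helper functions (collect_shared and collect_fresh are made-up names for illustration only).

```python
def collect_shared(item, bucket=[]):
    # The default list is created once, at definition time, so every call
    # that omits `bucket` appends to the same object.
    bucket.append(item)
    return bucket


def collect_fresh(item, bucket=None):
    # A None sentinel gives each call its own list unless one is passed in.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = collect_shared("a")
second = collect_shared("b")  # same list object as `first`
third = collect_fresh("a")
fourth = collect_fresh("b")   # independent of `third`

print(second)  # ['a', 'b']
print(fourth)  # ['b']
```

For that reason, a None default (or always passing the list explicitly) is usually the safer choice for this pattern.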
Upvotes: 1
Reputation: 169338
I'd reformulate this as a recursive generator function:
def find_schema_differences(master_schema, client_schema):
    for master_key, master_value in master_schema.items():
        if master_key not in client_schema:
            yield (master_key, master_value)
        elif not isinstance(master_value, dict):
            if client_schema[master_key] != master_value:
                yield (master_key, master_value)
        else:
            found_deep_difference = False
            for diff in find_schema_differences(master_value, client_schema[master_key]):
                found_deep_difference = True
                yield diff
            if found_deep_difference:
                yield (master_key, master_value)
The found_deep_difference logic may not be what you expect; please adjust according to your requirements...
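To make that caveat concrete, here is the generator run on the sample dictionaries a and b from the earlier answer (reused here purely for illustration). The function is repeated so the snippet runs on its own. Note that nested differences come out as flat (key, value) pairs, followed by the parent key with its complete master value.

```python
def find_schema_differences(master_schema, client_schema):
    for master_key, master_value in master_schema.items():
        if master_key not in client_schema:
            yield (master_key, master_value)
        elif not isinstance(master_value, dict):
            if client_schema[master_key] != master_value:
                yield (master_key, master_value)
        else:
            found_deep_difference = False
            for diff in find_schema_differences(master_value, client_schema[master_key]):
                found_deep_difference = True
                yield diff  # nested pairs are yielded without their parent key
            if found_deep_difference:
                yield (master_key, master_value)  # then the whole parent value


a = {'x': 1, 'y': 2, 'z': {'a': 1, 'b': 2, 'c': 3}}
b = {'x': 1, 'y': 3, 'z': {'a': 1, 'b': 3}}

# A generator is lazy; wrap it in list() to collect all results.
differences = list(find_schema_differences(a, b))
print(differences)
# [('y', 2), ('b', 2), ('c', 3), ('z', {'a': 1, 'b': 2, 'c': 3})]
```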
Upvotes: 1