Reputation: 1985
I have two dictionaries, and I need to find the difference between the two, which should give me both a key and a value.
I have searched and found some addons/packages like datadiff and dictdiff-master, but when I try to import them in Python 2.7, it says that no such modules are defined.
I used a set here:
first_dict = {}
second_dict = {}
value = set(second_dict) - set(first_dict)
print value
My output is:
>>> set(['SCD-3547', 'SCD-3456'])
I am getting only keys, and I need to also get the values.
Upvotes: 181
Views: 342806
Reputation: 19
This solution worked pretty well for me. It returns a new dictionary containing every key/value pair from dict_one whose key is missing from dict_two or whose value differs in dict_two.
diff = {x: dict_one[x] for x in dict_one if x not in dict_two or x in dict_two and dict_one[x] != dict_two[x]}
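As a usage sketch, here is the same idea run against two small sample dictionaries (the sample data is made up for illustration). Note that it only reports entries from dict_one; keys that exist only in dict_two are not included.

```python
# Hypothetical sample data, chosen only to illustrate the comprehension.
dict_one = {'a': 1, 'b': 2, 'c': 3}
dict_two = {'a': 1, 'b': 5, 'd': 4}

# Keep entries of dict_one that are missing from dict_two
# or that have a different value there.
diff = {x: dict_one[x] for x in dict_one
        if x not in dict_two or dict_one[x] != dict_two[x]}

print(diff)  # {'b': 2, 'c': 3}
```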
Upvotes: -1
Reputation: 15632
Requirements: it must go to any depth, and it must show a "key trail" giving the precise location in the dict whenever a difference is reported. It will probably need to be recursive, must deal with any type of value at any depth, and should not need an outside package.
import copy
import json

def diff_dicts(expected_dict, found_dict, depth=0, key_trail=None):
    # Use None instead of a mutable default so calls do not share one list.
    if key_trail is None:
        key_trail = []
    found_keys = list(found_dict.keys())
    for key, expected_value in expected_dict.items():
        key_trail.append(key)
        if key in found_dict:
            found_keys.remove(key)
            found_value = found_dict[key]
            if found_value != expected_value:
                if isinstance(expected_value, dict) and isinstance(found_value, dict):
                    diff_dicts(expected_value, found_value, depth=depth+1, key_trail=copy.deepcopy(key_trail))
                else:
                    print(f"""*** difference for key {key} depth {depth}:
expected of type {type(expected_value)}: {expected_value}
found of type {type(found_value)}: {found_value}
key trail: {key_trail}""")
        else:
            print(f'*** no key "{key}" in found_dict depth {depth}:\n{json.dumps(found_dict, indent=2)}\nkey trail: {key_trail}')
        key_trail.pop()
    # Whatever is left in found_keys exists only in found_dict.
    for found_key in found_keys:
        print(f'*** key "{found_key}" in found_dict, but absent from expected_dict depth {depth}:\n{json.dumps(expected_dict, indent=2)}\nkey trail: {key_trail}')

diff_dicts(expected_data, used_data)
NB Python dicts can also contain lists as values. The code for dealing with this (... elif isinstance(expected_value, list) and isinstance(found_value, list): ...) adds a lot more lines to handle things properly (a test must be made that the lists are the same length, that each element of both lists is a dict, that diff_dicts passes for each pair of elements, etc.) but is obvious.
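The list handling described above is not shown in the answer. One possible way to sketch it is a generator that walks dicts and lists together and yields a key trail for each difference. This is a separate illustration, not the answer's code: the names iter_diffs and MISSING are hypothetical, and unlike the description above it compares list elements of any type by position rather than requiring each element to be a dict.

```python
MISSING = object()  # hypothetical sentinel marking an absent key or list element


def iter_diffs(expected, found, trail=()):
    """Yield (trail, expected_value, found_value) for each difference."""
    if isinstance(expected, dict) and isinstance(found, dict):
        # Visit the keys of expected first, then keys that only found has.
        keys = list(expected) + [k for k in found if k not in expected]
        for key in keys:
            yield from iter_diffs(expected.get(key, MISSING),
                                  found.get(key, MISSING),
                                  trail + (key,))
    elif isinstance(expected, list) and isinstance(found, list):
        # Compare element by element; a shorter list is padded with MISSING.
        for index in range(max(len(expected), len(found))):
            exp_item = expected[index] if index < len(expected) else MISSING
            found_item = found[index] if index < len(found) else MISSING
            yield from iter_diffs(exp_item, found_item, trail + (index,))
    elif expected is MISSING or found is MISSING or expected != found:
        yield trail, expected, found


expected_data = {'a': 1, 'b': [{'x': 1}, {'y': 2}], 'c': {'d': 3}}
found_data = {'a': 1, 'b': [{'x': 1}, {'y': 5}, 7], 'c': {}}

diffs = list(iter_diffs(expected_data, found_data))
for trail, exp_value, found_value in diffs:
    print(trail, exp_value, found_value)
```

Returning the differences instead of printing them makes the result easier to test or post-process; the trail tuple plays the role of the key trail.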
Upvotes: 0
Reputation: 383
According to the documentation, Python now offers set operations directly on keys.
Keys views are set-like since their entries are unique and hashable. Items views also have set-like operations since the (key, value) pairs are unique and the keys are hashable. If all values in an items view are hashable as well, then the items view can interoperate with other sets. (Values views are not treated as set-like since the entries are generally not unique.) For set-like views, all of the operations defined for the abstract base class collections.abc.Set are available (for example, ==, <, or ^). While using set operators, set-like views accept any iterable as the other operand, unlike sets which only accept sets as the input.
So, this works and prints {'a'}
dict1 = {'a': 1, 'b': 2, 'c': 3}
dict2 = {'b': 4, 'c': 5, 'd': 6}
# Get the keys of both dictionaries as set-like views
keys1 = dict1.keys()
keys2 = dict2.keys()
# Perform set subtraction
unique_keys = keys1 - keys2
print(unique_keys)
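Since the question asks for values as well as keys, the same set-like behaviour of items views can be used. The following is a minimal sketch for Python 3 that assumes all values are hashable (the sample data is made up); with unhashable values such as lists or dicts the subtraction raises a TypeError.

```python
dict1 = {'a': 1, 'b': 2, 'c': 3}
dict2 = {'b': 4, 'c': 3, 'd': 6}

# Items views support set operations when every value is hashable.
# Each result is a set of (key, value) tuples, turned back into a dict here.
only_in_dict1 = dict(dict1.items() - dict2.items())
only_in_dict2 = dict(dict2.items() - dict1.items())

print(only_in_dict1)  # holds the pairs ('a', 1) and ('b', 2)
print(only_in_dict2)  # holds the pairs ('b', 4) and ('d', 6)
```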
Upvotes: 0
Reputation: 32388
On Python 3 I'm getting unhashable type: 'dict' errors. I know the OP asked about Python 2.7, but since that version has reached end of life, here's a Python 3 compatible function:
def dict_diff(a, b):
    diff = {}
    for k, v in a.items():
        if k not in b:
            diff[k] = v
        elif v != b[k]:
            diff[k] = '%s != %s' % (v, b[k])
    for k, v in b.items():
        if k not in a:
            diff[k] = v
    return diff
The output is as follows:
d1 = {1:'donkey', 2:'chicken', 3:'dog'}
d2 = {1:'donkey', 2:'chimpansee', 4:'chicken'}
diff = dict_diff(d1, d2)
# {2: 'chicken != chimpansee', 3: 'dog', 4: 'chicken'}
Upvotes: 0
Reputation: 1935
Another solution would be dictdiffer (https://github.com/inveniosoftware/dictdiffer).
import dictdiffer
a_dict = {
    'a': 'foo',
    'b': 'bar',
    'd': 'barfoo'
}
b_dict = {
    'a': 'foo',
    'b': 'BAR',
    'c': 'foobar'
}
for diff in list(dictdiffer.diff(a_dict, b_dict)):
    print(diff)
A diff is a tuple with the type of change, the path to the entry, and the changed value(s).
('change', 'b', ('bar', 'BAR'))
('add', '', [('c', 'foobar')])
('remove', '', [('d', 'barfoo')])
Upvotes: 94
Reputation: 691
For testing, the datatest package will check for differences in dictionaries, numpy arrays, pandas dataframes, etc. Datatest also lets you set a tolerance for floating point comparisons.
from datatest import validate, accepted
def test_compare_dict():
    expected = {"key1": 0.5}
    actual = {"key1": 0.499}
    with accepted.tolerance(0.1):
        validate(expected, actual)
Differences result in a datatest.ValidationError that contains the relevant Invalid, Deviation, Missing, or Extra items.
Upvotes: 0
Reputation: 80222
This solution works with dictionaries whose values are unhashable, which avoids this error:
TypeError: unhashable type: 'dict'
Start with the top-ranked solution from @Roedy. We create two dictionaries whose values are lists, a good example of something non-hashable:
>>> dict1 = {1:['donkey'], 2:['chicken'], 3:['dog']}
>>> dict2 = {1:['donkey'], 2:['chimpansee'], 4:['chicken']}
Then we preprocess to make each value hashable using str(value):
>>> set1 = set([(key, str(value)) for key, value in dict1.items()])
>>> set2 = set([(key, str(value)) for key, value in dict2.items()])
Then we continue as per the answer from @Roedy:
>>> set1 ^ set2
{(3, "['dog']"), (4, "['chicken']"), (2, "['chimpansee']"), (2, "['chicken']")}
Upvotes: 1
Reputation: 53
a_dic={'a':1, 'b':2}
b_dic={'a':1, 'b':20}
sharedmLst = set(a_dic.items()).intersection(b_dic.items())
diff_from_b = set(a_dic.items()) - sharedmLst
diff_from_a = set(b_dic.items()) - sharedmLst
print("Among the items in a_dic, the item different from b_dic",diff_from_b)
print("Among the items in b_dic, the item different from a_dic",diff_from_a)
Result :
Among the items in a_dic, the item different from b_dic {('b', 2)}
Among the items in b_dic, the item different from a_dic {('b', 20)}
Upvotes: 0
Reputation: 77
For a one-sided comparison you can use a dict comprehension:
dict1 = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
dict2 = {'a': 'OMG', 'b': 2, 'c': 3, 'd': 4}
data = {a: dict1[a] for a in dict1 if dict1[a] != dict2[a]}
Output: {'a': 1}
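The comprehension above assumes every key of dict1 is also present in dict2; otherwise the lookup dict2[a] raises a KeyError. A sketch that tolerates missing keys could use dict.get with a sentinel. The name MISSING and the extra key 'e' are assumptions added for illustration.

```python
MISSING = object()  # hypothetical sentinel meaning "key not present in dict2"

dict1 = {'a': 1, 'b': 2, 'c': 3, 'e': 5}
dict2 = {'a': 'OMG', 'b': 2, 'c': 3, 'd': 4}

# Keep entries of dict1 whose value differs in dict2 or whose key is absent there.
data = {k: v for k, v in dict1.items() if dict2.get(k, MISSING) != v}

print(data)  # {'a': 1, 'e': 5}
```

A dedicated sentinel is used rather than None so that a real None value in dict2 is still compared correctly.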
Upvotes: 1
Reputation: 1779
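Here is a variation that lets you update the values in dict1 if you know the values in dict2 are right. Keys of dict1 that are missing from dict2 are set to None, because dict2.get(k) returns None for them.
Consider: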
dict1.update((k, dict2.get(k)) for k, v in dict1.items())
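If keys that are missing from dict2 should keep their current value rather than become None, one possible variant restricts the update to the keys both dictionaries share. This is a sketch with made-up sample data, not the answer's original code.

```python
dict1 = {'a': 1, 'b': 2}
dict2 = {'a': 10, 'c': 3}

# Only overwrite keys that exist in both dictionaries;
# 'b' is absent from dict2 and therefore keeps its value.
dict1.update({k: dict2[k] for k in dict1.keys() & dict2.keys()})

print(dict1)  # {'a': 10, 'b': 2}
```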
Upvotes: 0
Reputation: 14801
You can use DeepDiff:
pip install deepdiff
Among other things, it lets you recursively calculate the difference of dictionaries, iterables, strings and other objects:
>>> from deepdiff import DeepDiff
>>> d1 = {1:1, 2:2, 3:3, "foo":4}
>>> d2 = {1:1, 2:4, 3:3, "bar":5, 6:6}
>>> DeepDiff(d1, d2)
{'dictionary_item_added': [root['bar'], root[6]],
'dictionary_item_removed': [root['foo']],
'values_changed': {'root[2]': {'new_value': 4, 'old_value': 2}}}
It lets you see what changed (even types), what was added and what was removed. It also lets you do many other things like ignoring duplicates and ignoring paths (defined by regex).
Upvotes: 27
Reputation: 2963
I would recommend using something already written by good developers, like pytest. It deals with any data type, not only dicts. And, BTW, pytest is very good at testing.
from _pytest.assertion.util import _compare_eq_any
print('\n'.join(_compare_eq_any({'a': 'b'}, {'aa': 'vv'}, verbose=3)))
Output is:
Left contains 1 more item:
{'a': 'b'}
Right contains 1 more item:
{'aa': 'vv'}
Full diff:
- {'aa': 'vv'}
? - ^^
+ {'a': 'b'}
? ^
If you don't like using private functions (those starting with _), just have a look at the source code and copy/paste the function into your code.
P.S.: Tested with pytest==6.2.4
Upvotes: 17
Reputation: 753
This is my own version, from combining https://stackoverflow.com/a/67263119/919692 with https://stackoverflow.com/a/48544451/919692, and now I see it is quite similar to https://stackoverflow.com/a/47433207/919692:
def dict_diff(dict_a, dict_b, show_value_diff=True):
    result = {}
    result['added'] = {k: dict_b[k] for k in set(dict_b) - set(dict_a)}
    result['removed'] = {k: dict_a[k] for k in set(dict_a) - set(dict_b)}
    if show_value_diff:
        common_keys = set(dict_a) & set(dict_b)
        result['value_diffs'] = {
            k: (dict_a[k], dict_b[k])
            for k in common_keys
            if dict_a[k] != dict_b[k]
        }
    return result
Upvotes: 10
Reputation: 1758
A solution is to use the unittest module:
from unittest import TestCase
TestCase().assertDictEqual(expected_dict, actual_dict)
Obtained from the question "How can you test that two dictionaries are equal with pytest in python".
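Outside of a test runner, the assertion raises an AssertionError whose message describes the mismatch. A small sketch of capturing that message follows; the sample dictionaries are made up, and the exact wording of the message is left to unittest.

```python
from unittest import TestCase

# Hypothetical sample data for illustration.
expected_dict = {'a': 1, 'b': 2}
actual_dict = {'a': 1, 'b': 3}

message = None
try:
    TestCase().assertDictEqual(expected_dict, actual_dict)
except AssertionError as error:
    # The message contains both dicts and a line-by-line diff.
    message = str(error)
    print(message)
```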
Upvotes: 24
Reputation: 12817
Not sure this is what the OP asked for, but this is what I was looking for when I came across this question - specifically, how to show key by key the difference between two dicts:
Pitfall: when one dict is missing a key and the other has that key with a None value, the function treats them as the same.
This is not optimized at all, so it is only suitable for small dicts.
def diff_dicts(a, b, drop_similar=True):
    res = a.copy()
    for k in res:
        if k not in b:
            res[k] = (res[k], None)
    for k in b:
        if k in res:
            res[k] = (res[k], b[k])
        else:
            res[k] = (None, b[k])
    if drop_similar:
        res = {k: v for k, v in res.items() if v[0] != v[1]}
    return res

print(diff_dicts({'a': 1}, {}))
print(diff_dicts({'a': 1}, {'a': 2}))
print(diff_dicts({'a': 2}, {'a': 2}))
print(diff_dicts({'a': 2}, {'b': 2}))
print(diff_dicts({'a': 2}, {'a': 2, 'b': 1}))
Output:
{'a': (1, None)}
{'a': (1, 2)}
{}
{'a': (2, None), 'b': (None, 2)}
{'b': (None, 1)}
Upvotes: 8
Reputation: 11
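This will return a new dict containing only the changed data: for each key of obj_1 whose value differs, the result holds the value from obj_2 (or None if the key is missing there). Keys that exist only in obj_2 are not reported.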
def get_difference(obj_1: dict, obj_2: dict) -> dict:
    result = {}
    for key in obj_1.keys():
        value = obj_1[key]
        if isinstance(value, dict):
            difference = get_difference(value, obj_2.get(key, {}))
            if difference:
                result[key] = difference
        elif value != obj_2.get(key):
            result[key] = obj_2.get(key, None)
    return result
Upvotes: 1
Reputation: 4614
This function gives you all the diffs (and what stayed the same) based on the dictionary keys only. It also showcases dict comprehensions, set operations and Python 3.6 type annotations :)
from typing import Dict, Any, Tuple
def get_dict_diffs(a: Dict[str, Any], b: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any], Dict[str, Any], Dict[str, Any]]:
    added_to_b_dict: Dict[str, Any] = {k: b[k] for k in set(b) - set(a)}
    removed_from_a_dict: Dict[str, Any] = {k: a[k] for k in set(a) - set(b)}
    common_dict_a: Dict[str, Any] = {k: a[k] for k in set(a) & set(b)}
    common_dict_b: Dict[str, Any] = {k: b[k] for k in set(a) & set(b)}
    return added_to_b_dict, removed_from_a_dict, common_dict_a, common_dict_b
If you want to compare the dictionary values:
values_in_b_not_a_dict = {k : b[k] for k, _ in set(b.items()) - set(a.items())}
Upvotes: 8
Reputation: 41
Old question, but thought I'd share my solution anyway. Pretty simple.
dicta_set = set(dicta.items()) # creates a set of tuples (k/v pairs)
dictb_set = set(dictb.items())
setdiff = dictb_set.difference(dicta_set) # any set method you want for comparisons
for k, v in setdiff: # unpack the tuples for processing
    print(f"k/v differences = {k}: {v}")
This code creates two sets of tuples representing the k/v pairs. It then uses a set method of your choosing to compare the tuples. Lastly, it unpacks the tuples (k/v pairs) for processing.
Upvotes: 3
Reputation: 3328
A function using the symmetric difference set operator, as mentioned in other answers, which preserves the origins of the values:
def diff_dicts(a, b, missing=KeyError):
    """
    Find keys and values which differ from `a` to `b` as a dict.

    If a value differs from `a` to `b` then the value in the returned dict will
    be: `(a_value, b_value)`. If either is missing then the token from
    `missing` will be used instead.

    :param a: The from dict
    :param b: The to dict
    :param missing: A token used to indicate the dict did not include this key
    :return: A dict of keys to tuples with the matching value from a and b
    """
    return {
        key: (a.get(key, missing), b.get(key, missing))
        for key in dict(
            set(a.items()) ^ set(b.items())
        ).keys()
    }
print(diff_dicts({'a': 1, 'b': 1}, {'b': 2, 'c': 2}))
# {'c': (<class 'KeyError'>, 2), 'a': (1, <class 'KeyError'>), 'b': (1, 2)}
We use the symmetric difference set operator on the tuples generated from taking items. This generates a set of distinct (key, value) tuples from the two dicts.
We then make a new dict from that to collapse the keys together and iterate over these. These are the only keys that have changed from one dict to the next.
We then compose a new dict using these keys with a tuple of the values from each dict substituting in our missing token when the key isn't present.
Upvotes: 8
Reputation: 2375
I think it's better to use the symmetric difference operation of sets for this (see the Python documentation on set types).
>>> dict1 = {1:'donkey', 2:'chicken', 3:'dog'}
>>> dict2 = {1:'donkey', 2:'chimpansee', 4:'chicken'}
>>> set1 = set(dict1.items())
>>> set2 = set(dict2.items())
>>> set1 ^ set2
{(2, 'chimpansee'), (4, 'chicken'), (2, 'chicken'), (3, 'dog')}
It is symmetric because:
>>> set2 ^ set1
{(2, 'chimpansee'), (4, 'chicken'), (2, 'chicken'), (3, 'dog')}
This is not the case when using the difference operator.
>>> set1 - set2
{(2, 'chicken'), (3, 'dog')}
>>> set2 - set1
{(2, 'chimpansee'), (4, 'chicken')}
However it may not be a good idea to convert the resulting set to a dictionary because you may lose information:
>>> dict(set1 ^ set2)
{2: 'chicken', 3: 'dog', 4: 'chicken'}
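One way to avoid losing information, sketched here as an assumption rather than as part of the answer, is to group the differing values by key instead of calling dict() directly. The name grouped is hypothetical. The grouping keeps every differing value but does not record which dictionary each value came from.

```python
from collections import defaultdict

dict1 = {1: 'donkey', 2: 'chicken', 3: 'dog'}
dict2 = {1: 'donkey', 2: 'chimpansee', 4: 'chicken'}

# Collect every differing value under its key so that key 2,
# which appears with two different values, keeps both of them.
grouped = defaultdict(set)
for key, value in set(dict1.items()) ^ set(dict2.items()):
    grouped[key].add(value)

for key in sorted(grouped):
    print(key, sorted(grouped[key]))
```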
Upvotes: 235
Reputation: 106
def flatten_it(d):
    if isinstance(d, list) or isinstance(d, tuple):
        return tuple([flatten_it(item) for item in d])
    elif isinstance(d, dict):
        return tuple([(flatten_it(k), flatten_it(v)) for k, v in sorted(d.items())])
    else:
        return d

dict1 = {'a': 1, 'b': 2, 'c': 3}
dict2 = {'a': 1, 'b': 1}
print set(flatten_it(dict1)) - set(flatten_it(dict2)) # set([('b', 2), ('c', 3)])
# or
print set(flatten_it(dict2)) - set(flatten_it(dict1)) # set([('b', 1)])
Upvotes: 5
Reputation: 81
What about this? Not as pretty but explicit.
orig_dict = {'a' : 1, 'b' : 2}
new_dict = {'a' : 2, 'v' : 'hello', 'b' : 2}
updates = {}
for k2, v2 in new_dict.items():
    if k2 in orig_dict:
        if v2 != orig_dict[k2]:
            updates.update({k2 : v2})
    else:
        updates.update({k2 : v2})
#test it
#value of 'a' was changed
#'v' is a completely new entry
assert all(k in updates for k in ['a', 'v'])
Upvotes: 5
Reputation: 406
You were right to look at using a set, we just need to dig in a little deeper to get your method to work.
First, the example code:
test_1 = {"foo": "bar", "FOO": "BAR"}
test_2 = {"foo": "bar", "f00": "b@r"}
We can see right now that both dictionaries contain an identical key/value pair:
{"foo": "bar", ...}
Each dictionary also contains a completely different key/value pair. But how do we detect the difference? Dictionaries don't support that. Instead, you'll want to use a set.
Here is how to turn each dictionary into a set we can use:
set_1 = set(test_1.items())
set_2 = set(test_2.items())
This returns a set containing a series of tuples. Each tuple represents one key/value pair from your dictionary.
Now, to find the difference between set_1 and set_2:
print set_1 - set_2
>>> {('FOO', 'BAR')}
Want a dictionary back? Easy, just:
dict(set_1 - set_2)
>>> {'FOO': 'BAR'}
Upvotes: 13
Reputation: 236004
Try the following snippet, using a dictionary comprehension:
value = { k : second_dict[k] for k in set(second_dict) - set(first_dict) }
In the above code we find the difference of the keys and then rebuild a dict taking the corresponding values.
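As a usage sketch, here is the comprehension applied to sample data. The keys SCD-3547 and SCD-3456 come from the question's output; the other keys and all values are made up for illustration. The comprehension above only reports keys that are new in second_dict, so a second comprehension (an addition, not part of the answer) is shown for keys whose value changed.

```python
# Hypothetical sample data; only the SCD-3547 and SCD-3456 keys come from the question.
first_dict = {'SCD-1000': 'open', 'SCD-2000': 'open'}
second_dict = {
    'SCD-1000': 'open',
    'SCD-2000': 'closed',
    'SCD-3547': 'open',
    'SCD-3456': 'closed',
}

# Keys that exist only in second_dict, with their values.
added = {k: second_dict[k] for k in set(second_dict) - set(first_dict)}

# Keys present in both dictionaries whose value differs.
changed = {
    k: second_dict[k]
    for k in set(second_dict) & set(first_dict)
    if second_dict[k] != first_dict[k]
}

print(sorted(added.items()))
print(sorted(changed.items()))
```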
Upvotes: 120