ahajib

Reputation: 13510

Compare two large dictionaries and create lists of values for keys they have in common

I have two dictionaries like:

dict1 = { (1,2) : 2, (2,3): 3, (1,3): 3}
dict2 = { (1,2) : 1, (1,3): 2}

What I want as output is two lists of values for the keys that exist in both dictionaries:

[2,3]
[1,2]

What I am doing right now is something like this:

list1 = []
list2 = []

for key in dict1.keys():
    if key in dict2.keys():
        list1.append(dict1.get(key))
        list2.append(dict2.get(key))

This code takes too long to run. Is there a more efficient way of doing it?

Upvotes: 20

Views: 3063

Answers (4)

bgusach

Reputation: 15143

Use keys() in Python 3 and viewkeys() in Python 2. These return view objects that behave like sets and cost almost nothing to construct: they are just "views" of the underlying dictionary keys. This way you avoid building set objects from the dictionaries first.

common = dict1.viewkeys() & dict2.viewkeys()
list1 = [dict1[k] for k in common]
list2 = [dict2[k] for k in common]

A key view can also be intersected directly with a dictionary, so the following code works as well. I would prefer the previous sample though.

common = dict1.viewkeys() & dict2
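
For readers on Python 3, here is a minimal sketch of the same idea using keys(), which there returns a set-like view. It reuses the question's sample dictionaries and assumes nothing beyond the standard library:

```python
dict1 = {(1, 2): 2, (2, 3): 3, (1, 3): 3}
dict2 = {(1, 2): 1, (1, 3): 2}

# keys() returns a set-like view in Python 3, so & works without
# converting either dictionary to a set first.
common = dict1.keys() & dict2.keys()

# Both comprehensions iterate over the same set object, so the two
# lists stay paired by key even though set order is arbitrary.
list1 = [dict1[k] for k in common]
list2 = [dict2[k] for k in common]
```

The intersection itself is an ordinary set, so the order of the values is not guaranteed to match the insertion order of either dictionary.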

Upvotes: 0

Kasravnd

Reputation: 107287

You can use a list comprehension inside zip():

>>> vals1, vals2 = zip(*[(dict1[k], v) for k, v in dict2.items() if k in dict1])
>>> 
>>> vals1
(2, 3)
>>> vals2
(1, 2)

Or, as a more functional approach, you can combine key views with operator.itemgetter():

>>> from operator import itemgetter
>>> intersect = dict1.viewkeys() & dict2.viewkeys()
>>> itemgetter(*intersect)(dict1)
(2, 3)
>>> itemgetter(*intersect)(dict2)
(1, 2)

Benchmark against the accepted answer:

from timeit import timeit


inp1 = """
commons = set(dict1).intersection(set(dict2))
list1 = [dict1[k] for k in commons]
list2 = [dict2[k] for k in commons]
   """

inp2 = """
zip(*[(dict1[k], v) for k, v in dict2.items() if k in dict1])
   """
inp3 = """
intersect = dict1.viewkeys() & dict2.viewkeys()
itemgetter(*intersect)(dict1)
itemgetter(*intersect)(dict2)
"""
dict1 = {(1, 2): 2, (2, 3): 3, (1, 3): 3}
dict2 = {(1, 2): 1, (1, 3): 2}
print 'inp1 ->', timeit(stmt=inp1,
                        number=1000000,
                        setup="dict1 = {}; dict2 = {}".format(dict1, dict2))
print 'inp2 ->', timeit(stmt=inp2,
                        number=1000000,
                        setup="dict1 = {}; dict2 = {}".format(dict1, dict2))
print 'inp3 ->', timeit(stmt=inp3,
                        number=1000000,
                        setup="dict1 = {}; dict2 = {};from operator import itemgetter".format(dict1, dict2))

Output:

inp1 -> 0.000132083892822
inp2 -> 0.000128984451294
inp3 -> 0.000160932540894

For dictionaries of length 10000 with randomly generated items, over 100 loops:

inp1 -> 1.18336105347
inp2 -> 1.00519990921
inp3 -> 1.52266311646
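
The setup for this larger run is not shown, so the following is only a rough Python 3 sketch of how a comparable test could be put together. The random_dict helper, the key range, and the function names are assumptions made for illustration, and the timings it prints will differ from the numbers above:

```python
import random
from operator import itemgetter
from timeit import timeit

random.seed(0)  # fixed seed so repeated runs use the same data


def random_dict(n, key_range=150):
    """Hypothetical generator: n distinct random 2-tuple keys with int values."""
    d = {}
    while len(d) < n:
        key = (random.randrange(key_range), random.randrange(key_range))
        d[key] = random.randrange(100)
    return d


dict1 = random_dict(10000)
dict2 = random_dict(10000)


def with_sets():
    commons = set(dict1).intersection(dict2)
    return [dict1[k] for k in commons], [dict2[k] for k in commons]


def with_zip():
    vals1, vals2 = zip(*[(dict1[k], v) for k, v in dict2.items() if k in dict1])
    return vals1, vals2


def with_views():
    # itemgetter needs at least two keys to return a tuple; the seeded
    # data here shares far more than that.
    getter = itemgetter(*(dict1.keys() & dict2.keys()))
    return getter(dict1), getter(dict2)


timings = {f.__name__: timeit(f, number=100)
           for f in (with_sets, with_zip, with_views)}
for name, seconds in timings.items():
    print(name, '->', seconds)
```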

Edit:

As @Davidmh mentioned in a comment, the second approach raises an exception when the dictionaries have no keys in common, because itemgetter() needs at least one argument. To avoid that, you can wrap the code in a try-except block:

try:
    intersect = dict1.viewkeys() & dict2.viewkeys()
    vals1 = itemgetter(*intersect)(dict1)
    vals2 = itemgetter(*intersect)(dict2)
except TypeError:
    vals1, vals2 = [], []  # two separate lists rather than two names for one list
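
One more edge case is worth keeping in mind with itemgetter: when it is given exactly one key it returns the bare value instead of a one-element tuple. The sketch below is written for Python 3, and common_values is a hypothetical helper name rather than anything from the standard library. It handles the zero-key and one-key cases explicitly so the result is always a pair of lists:

```python
from operator import itemgetter


def common_values(d1, d2):
    """Hypothetical helper: paired value lists for keys present in both dicts."""
    keys = list(d1.keys() & d2.keys())
    if len(keys) < 2:
        # itemgetter() with no keys raises TypeError, and with exactly one
        # key it returns a bare value, so index the dictionaries directly.
        return [d1[k] for k in keys], [d2[k] for k in keys]
    return list(itemgetter(*keys)(d1)), list(itemgetter(*keys)(d2))
```

With this shape the caller never has to check whether it got a tuple, a single value, or an exception.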

Upvotes: 4

mgilson

Reputation: 309841

Don't use dict.keys(). On Python 2.x it creates a new list every time it is called, which is an O(N) operation, and list.__contains__ is another O(N) operation on average. Instead, rely on the fact that dictionaries are directly iterable and support O(1) membership tests:

list1 = []
list2 = []

for key in dict1:
    if key in dict2:
        list1.append(dict1.get(key))
        list2.append(dict2.get(key))
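
The same single pass can also be written as a list comprehension. This is a sketch for Python 3.7 or later, where dictionaries keep insertion order, so the results follow the order of dict1; the sample data is taken from the question and the name common is introduced only for illustration:

```python
dict1 = {(1, 2): 2, (2, 3): 3, (1, 3): 3}
dict2 = {(1, 2): 1, (1, 3): 2}

# "key in dict2" is a hash lookup, so the whole pass is O(len(dict1)).
common = [key for key in dict1 if key in dict2]

list1 = [dict1[key] for key in common]
list2 = [dict2[key] for key in common]

print(list1)  # [2, 3]
print(list2)  # [1, 2]
```

If one dictionary is much smaller than the other, iterating over the smaller one and testing membership in the larger one keeps the number of lookups down.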

Note that on Python 2.7, you can use viewkeys() to get the intersection directly:

>>> a = {'foo': 'bar', 'baz': 'qux'}
>>> b = {'foo': 'bar'}
>>> a.viewkeys() & b
set(['foo'])

(On Python 3.x, you can use keys() here instead of viewkeys().)

for key in dict1.viewkeys() & dict2:
    list1.append(dict1[key])
    list2.append(dict2[key])

Upvotes: 14

BlackBear

Reputation: 22979

commons = set(dict1).intersection(set(dict2))
list1 = [dict1[k] for k in commons]
list2 = [dict2[k] for k in commons]

Upvotes: 28
