Reputation: 18680
Ok, I understand, in languages like C++, why calling a virtual method defined in a class is slower than calling a non-virtual method (you have to go through the dynamic dispatch table to look up the correct implementation to call).
But in Python, if I have:
list_of_sets = generate_a_list_containg_a_bunch_of_sets()
intersection_of_all = reduce(list_of_sets[0].intersection, list_of_sets)
This is dramatically slower (about 40% in my experiments) than:
list_of_sets = generate_a_list_containg_a_bunch_of_sets()
intersection_of_all = reduce(set.intersection, list_of_sets)
What I don't get is why that should be so much slower. The method lookup (I would think) happens on the call to reduce, so inside reduce, where the intersection method is actually called, it shouldn't have to be looked up again (it would just reuse the same method reference).
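(For reference, here is a minimal sketch of the kind of comparison I mean; the sets below are made-up stand-ins for whatever generate_a_list_containg_a_bunch_of_sets() returns.)

import timeit
from functools import reduce

# Hypothetical stand-in data: 15 heavily overlapping sets of 1000 ints each.
list_of_sets = [set(range(i, i + 1000)) for i in range(0, 105, 7)]

bound = timeit.timeit(
    lambda: reduce(list_of_sets[0].intersection, list_of_sets), number=1000)
unbound = timeit.timeit(
    lambda: reduce(set.intersection, list_of_sets), number=1000)
print(bound, unbound)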
Could someone illuminate where my understanding is flawed?
Upvotes: 1
Views: 192
Reputation: 601599
This is completely unrelated to method binding, etc. The first version computes the intersection of three sets in each iteration, while the second version intersects only two. This is easy to see if we write the explicit loops out instead.
Variant 1:
intersection = list_of_sets[0]
for s in list_of_sets[1:]:
    intersection = list_of_sets[0].intersection(intersection, s)
Variant 2:
intersection = list_of_sets[0]
for s in list_of_sets[1:]:
    intersection = set.intersection(intersection, s)
(Would you agree now that Guido has a point?)
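To make the difference concrete, here is a tiny example with throwaway sets (not the ones from the question): the bound method drags its receiver into every call, while the unbound form only intersects its arguments.

a = {1, 2, 3, 4}
b = {2, 3, 4, 5}
c = {3, 4, 5, 6}

# Bound method: the receiver `a` takes part in every intersection.
print(a.intersection(b, c))    # {3, 4}, i.e. a & b & c
# Unbound form: only the arguments are intersected.
print(set.intersection(b, c))  # {3, 4, 5}, i.e. b & c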
Note that this will probably be even faster:
intersection = list_of_sets[0]
for s in list_of_sets[1:]:
    intersection.intersection_update(s)
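If mutating list_of_sets[0] is a concern (the loop above updates it in place, since intersection starts out as a reference to it), copying the first set is a small variation on the same idea:

intersection = set(list_of_sets[0])  # copy, so list_of_sets[0] is left untouched
for s in list_of_sets[1:]:
    intersection.intersection_update(s)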
Upvotes: 12