Reputation: 3126
What is the difference between using the intersection() method and the & operator on Python sets? I read that in previous versions the argument to & had to be a set and not just any iterable, although that seems to be no longer the case.
Is there a difference in terms of semantics, constraints, performance, or simply Pythonic style?
Upvotes: 2
Views: 561
Reputation: 584
Here are some timings in Python 3, run on a 3.7 GHz CPU.
intersection() is the only operation where the difference in performance is negligible. For the other operations, the method ("non-operator") version is faster when it uses its flexibility to accept any iterable as an argument (not just an explicit set).
So explicitly creating a set has a performance cost. If the choice is otherwise arbitrary, converting an existing non-set iterable into a set just to use the operator version is the slower option.
import random
nums = random.choices(range(1, 10000), k=10000)  # this is a list
all_nums = set(range(1, 10000))
|
%timeit all_nums.union(nums)
248 µs ± 4.44 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit all_nums | set(nums)
409 µs ± 5.25 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
-
%timeit all_nums.difference(nums)
387 µs ± 2.35 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit all_nums - set(nums)
451 µs ± 1.59 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
&
%timeit all_nums.intersection(nums)
477 µs ± 4.57 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit all_nums & set(nums)
479 µs ± 2.23 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
^
%timeit all_nums.symmetric_difference(nums)
421 µs ± 840 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit all_nums ^ set(nums)
557 µs ± 1.98 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
I should add that the performance impact shown above is due to explicitly creating the set. If the argument is already a set, @kindall has the right answer (operator version may be faster).
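For anyone who wants to reproduce the comparison outside IPython, here is a minimal sketch using the standard timeit module. The helper name best_time and the loop counts are my own choices, not part of the measurements above, and wrapping each statement in a lambda adds a small constant call overhead to every case. It times the method on a list, the operator with an on-the-fly set(), and the operator with a set built beforehand, so the cost of the conversion can be seen separately.

```python
import random
import timeit

nums = random.choices(range(1, 10000), k=10000)  # a list, as in the timings above
all_nums = set(range(1, 10000))
nums_set = set(nums)  # built once, outside the timed code


def best_time(func, number=50, repeat=3):
    """Return the best per-call time in seconds for a zero-argument callable.

    Hypothetical helper for this sketch; number and repeat are arbitrary.
    """
    runs = timeit.repeat(func, number=number, repeat=repeat)
    return min(runs) / number


timings = {
    "intersection(list)": best_time(lambda: all_nums.intersection(nums)),
    "& set(list)": best_time(lambda: all_nums & set(nums)),
    "& existing set": best_time(lambda: all_nums & nums_set),
}

for label, seconds in timings.items():
    print(f"{label:20s} {seconds * 1e6:8.1f} us per call")
```

The absolute numbers will vary by machine and Python version, so treat them as relative indicators only.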
Upvotes: 0
Reputation: 33
Methods like intersection() will accept any iterable, whereas the operators will only accept set types.
The info is below the method description in the docs:
Note, the non-operator versions of union(), intersection(), difference(), and symmetric_difference(), issubset(), and issuperset() methods will accept any iterable as an argument. In contrast, their operator based counterparts require their arguments to be sets.
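A short sketch of what that note means in practice. The variable names are made up for illustration. The method takes a plain list, the operator raises TypeError for the same list, and the method can also take several iterables in one call.

```python
evens = {0, 2, 4, 6, 8}
small = [0, 1, 2, 3, 4]  # a list, not a set

# The method accepts any iterable.
via_method = evens.intersection(small)

# The operator requires both operands to be sets (or frozensets).
try:
    evens & small
    operator_error = None
except TypeError as exc:
    operator_error = exc

# Converting the list first makes the operator form work.
via_operator = evens & set(small)

# The method also accepts several iterables in a single call.
via_many = evens.intersection(small, range(3))

print(via_method)
print(type(operator_error).__name__)
print(via_operator == via_method)
print(via_many)
```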
Upvotes: 1
Reputation: 184345
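When both operands are already sets, there is no difference in functionality, although using the operators is a little faster because Python special-cases access to these methods: an operator skips the ordinary attribute lookup and method call that the named method goes through. The performance difference in most programs is not so great as to demand that the operators be used.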
Upvotes: 2
Reputation: 799210
The methods can be bound to names for later use, whereas the operators can be replaced by the functions in the operator module for the purpose of larger abstraction.
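To make both halves of that concrete, here is a small sketch. The data and names are invented for illustration. A bound method is stored in a variable and reused, and operator.and_ / operator.or_ stand in for & and | so the operation can be passed to functools.reduce.

```python
import operator
from functools import reduce

sets = [{1, 2, 3, 4}, {2, 3, 4, 5}, {3, 4, 6}]

# A bound method is an ordinary callable that remembers its set.
keep_allowed = {2, 3, 4}.intersection
filtered = [keep_allowed(s) for s in sets]

# operator.and_ is the function form of &, so it can be passed around.
common = reduce(operator.and_, sets)

# Swapping in operator.or_ changes the operation without changing the loop.
combined = reduce(operator.or_, sets)

print(filtered)
print(common)
print(combined)
```

Note that operator.and_ has the same constraint as &: every element being reduced must already be a set.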
Upvotes: 3