Reputation: 13645
I'm used to seeing `if obj is None:` in Python, and I've recently come across `if obj is ():`. Since tuples are not mutable, it sounds like a reasonable internal optimization in the Python interpreter to have the empty tuple be a singleton, therefore allowing the use of `is` rather than requiring `==`. But is this guaranteed somewhere? Since which version of the interpreter?
[edit] The question matters because if () is not a singleton and there is a way of producing an empty tuple with a different address, then using `is ()` is a bug. If it is only guaranteed since Python 2.x with x > 0, then it is important to know the value of x if you need to ensure backward compatibility of your code. It is also important to know whether this can break your code when using PyPy / Jython / IronPython...
Upvotes: 10
Views: 1141
Reputation: 4926
It's not about optimization; it's about object comparison. Python's "is" tests object identity, so comparing against the empty tuple "()" does not require the "==" operator. In fact, any two objects in Python can be compared with "is".
>>> obj = ()
>>> obj is ()
True
>>> isinstance(obj, tuple)
True
>>> obj is tuple
False
>>> type(obj) is tuple
True
>>> type(())
<type 'tuple'>
>>> type(tuple)
<type 'type'>
>>> tuple == type(()) # value comparison with ==
True
Same for any other value:
>>> 333 is int
False
>>> type(333) is int
True
>>> isinstance(333, int)
True
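To make the difference between identity and equality concrete, here is a minimal sketch using lists, where two separately created objects are always distinct. The names a, b, and c are illustrative only.

```python
# Two separately created lists with equal contents.
a = [1, 2, 3]
b = [1, 2, 3]
c = a  # a second name bound to the same object as a

print(a == b)  # True: the contents are equal
print(a is b)  # False: they are two separate objects
print(a is c)  # True: both names refer to one object
```

For immutable values such as the empty tuple, an implementation is free to reuse one object, which is why `is` can appear to behave like `==` there.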
Upvotes: -1
Reputation: 45552
From the Python 2 docs and Python 3 docs:
... two occurrences of the empty tuple may or may not yield the same object.
In other words, you can't count on `() is ()` to evaluate as true.
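A check that does not depend on object identity could look like the sketch below. The helper name is_empty_tuple is hypothetical and not part of any library; it only assumes that "empty tuple" means a tuple instance of length zero.

```python
def is_empty_tuple(obj):
    """Return True if obj is a tuple with no elements (hypothetical helper)."""
    # Check type and length instead of identity, so the result does not
    # depend on whether the interpreter shares a single empty-tuple object.
    return isinstance(obj, tuple) and len(obj) == 0


print(is_empty_tuple(()))         # True
print(is_empty_tuple(tuple([])))  # True
print(is_empty_tuple((1,)))       # False
print(is_empty_tuple([]))         # False: an empty list is not a tuple
```

If the type check is not needed, `obj == ()` or `len(obj) == 0` are simpler alternatives.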
Upvotes: 13
Reputation: 34354
This is a non-guaranteed implementation detail of current versions of CPython, so you won't necessarily be able to rely on it in other Python implementations, including Jython, IronPython, PyPy, and potentially future versions of CPython.
Using `is` appears to be about 0.04 μs faster on my system when comparing against a big list:
$ python -m timeit -s "x = range(10000)" "x is ()"
10000000 loops, best of 3: 0.0401 usec per loop
$ python -m timeit -s "x = range(10000)" "x == ()"
10000000 loops, best of 3: 0.0844 usec per loop
Of course it could be considerably worse if you are comparing against something with a custom `__eq__()` method:
$ python -m timeit -s $'import time\nclass X(object):\n def __eq__(self, other): return time.sleep(1)\nx = X()' "x == ()"
10 loops, best of 3: 1e+03 msec per loop
Still, if this efficiency difference is critical, I think that would point to a design problem.
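The same measurement can be sketched from within Python using the timeit module. This is an approximation of the shell commands, not a reproduction: the empty tuple is bound to the name empty (an assumption of this sketch) so the timed statement compares two variables, and the loop count n is arbitrary.

```python
import timeit

x = list(range(10000))  # a big list, as in the shell commands
empty = ()              # assumption: compare against a named empty tuple

n = 100000  # arbitrary number of loops
names = {"x": x, "empty": empty}
t_is = timeit.timeit("x is empty", globals=names, number=n)
t_eq = timeit.timeit("x == empty", globals=names, number=n)

# Absolute numbers depend on the machine and interpreter,
# so no expected output is shown.
print("is: %.4f usec per loop" % (t_is / n * 1e6))
print("==: %.4f usec per loop" % (t_eq / n * 1e6))
```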
Upvotes: 1
Reputation: 80801
Let's use the id() function to get the identity of ():
>>> id(())
140180995895376
>>> empty_tuple = ()
>>> id(empty_tuple)
140180995895376 # same as the id of ()
>>> from copy import copy
>>> id(copy(empty_tuple))
140180995895376 # still the same as the id of ()
It looks like () is effectively stored as a singleton in Python (at least in Python > 2.6).
The empty string "" behaves the same way.
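One caveat when comparing id() values: an id is only unique among objects that are alive at the same time, so ids of temporaries can coincide even for distinct objects. A sketch that keeps every object alive and compares with `is` directly might look like this; the names reference and candidates are illustrative only.

```python
from copy import copy

reference = ()

# Several ways of producing an empty tuple. All are kept alive in this
# dict, so the identity checks compare objects that exist at the same time.
candidates = {
    "literal": (),
    "tuple()": tuple(),
    "tuple([])": tuple([]),
    "slice": (1, 2)[2:],
    "copy": copy(reference),
}

for name, value in candidates.items():
    # Equality always holds; identity depends on the implementation,
    # so no expected output is shown for the last column.
    print(name, value == reference, value is reference)
```

Running this under different interpreters shows which construction paths share one object there, without implying any guarantee.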
Upvotes: 0