Reputation: 1165
I wrote the same three lines of code twice and got different results. First I ran them in an interactive shell:
>>> a = 10000
>>> b = 10000
>>> a is b
False
>>> a = 10000; b = 10000; a is b
True
Then I put the same code in a Python file:
a = 10000
b = 10000
print a is b
Running it prints True.
My Python environment:
Python 2.7.5 (default, Mar 9 2014, 22:15:05)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
What is going on here? Everyone is talking about compilation; does anyone know how the interactive shell compiles and runs these lines of code?
Upvotes: 4
Views: 193
Reputation: 32182
I think it's a compile-time optimisation. In the first case a = 10000 and b = 10000 are compiled separately, so the byte-code compiler has no (simple) way of determining that they are identical. In the other cases the compiler sees that a and b are initialised using the same literal and are not changed afterwards.
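The difference between compiling the assignments separately and compiling them as one unit can be reproduced outside the shell. This is a minimal sketch written for Python 3 and assuming CPython; identity_after is a hypothetical helper name, and the result for the separately compiled case is an implementation detail that is usually, but not guaranteed to be, False.

```python
# Sketch: compare compiling the assignments separately vs. as one unit.
# Assumes CPython; identity of equal ints is an implementation detail.

def identity_after(*sources):
    """Compile and run each source string on its own, then test `a is b`."""
    namespace = {}
    for src in sources:
        # Each call to compile() produces its own code object
        # with its own table of constants.
        exec(compile(src, "<demo>", "exec"), namespace)
    return namespace["a"] is namespace["b"]

separate = identity_after("a = 10000", "b = 10000")  # two code objects
together = identity_after("a = 10000; b = 10000")    # one code object

print("compiled separately:", separate)
print("compiled together:  ", together)
```

The interactive shell compiles each input line on its own, which corresponds to the first call; a file or a single semicolon-separated line corresponds to the second.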
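This has nothing to do with the small integer optimisation. That one is tied to the value rather than to the literal: it covers only small integers (up to 256 in CPython), whether they are written as literals or computed by an expression, i.e.
>>> a = 256; b = 256; a is b
True
but
>>> a = 256; b = 256; a + 1 is b + 1
False
because 257 is outside the cached range.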
The respective code is part of Python's peephole optimization (see https://github.com/python/cpython/blob/master/Python/peephole.c).
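The small-integer behaviour can be checked separately from constant merging by building the values at run time. A minimal sketch written for Python 3, assuming CPython's cache of the integers from -5 to 256; the variable names are only for illustration.

```python
# Sketch: the small-integer cache applies to values computed at run time.
# Assumes CPython, where the ints from -5 to 256 are preallocated.

x = int("255")  # built at run time so the compiler cannot fold constants
y = int("256")

cached = (x + 1) is (x + 1)    # both results are 256, inside the cache
uncached = (y + 1) is (y + 1)  # both results are 257, outside the cache

print("256 computed twice, same object:", cached)
print("257 computed twice, same object:", uncached)
```

On CPython the first line is expected to report True and the second False, but neither is guaranteed by the language.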
Upvotes: 1
Reputation: 13779
If you put either into a function, it will also evaluate to True. What's happening is that Python makes a list of constants used when compiling a function into bytecode, and equal constants will be "collapsed" into one value that is loaded two times. It looks like the interactive interpreter does the same when compiling one line of code.*
So here's the bytecode for one of these functions, obtained using dis. It's actually the same for either method except for the line numbers, so I am not copying both here.
2 0 LOAD_CONST 1 (10000)
3 STORE_FAST 0 (a)
6 LOAD_CONST 1 (10000)
9 STORE_FAST 1 (b)
12 LOAD_FAST 0 (a)
15 LOAD_FAST 1 (b)
18 COMPARE_OP 8 (is)
21 RETURN_VALUE
This is for:
def func():
    a = 10000; b = 10000; return a is b

from dis import dis
dis(func)
Note that both of the LOAD_CONST lines have the same argument. This is an index into func.__code__.co_consts, which is a tuple. Element 1 of that tuple is the int object 10000.
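The shared constant can also be inspected directly rather than read off the disassembly. This is a rough sketch for Python 3 on CPython; the exact contents and ordering of co_consts and the opcode names vary between versions, so it only looks for the value 10000.

```python
import dis

def func():
    a = 10000; b = 10000; return a is b

consts = func.__code__.co_consts

# Collect the co_consts index used by every LOAD_CONST that loads 10000.
load_args = [
    ins.arg
    for ins in dis.get_instructions(func)
    if ins.opname == "LOAD_CONST" and ins.argval == 10000
]

print("constants table:", consts)
print("LOAD_CONST indexes for 10000:", load_args)
print("func() returns:", func())
```

If both loads report the same index, a and b are bound to the very same int object, which is why the comparison is True.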
Just for completeness' sake, here's the disassembly of the original one-liner a = 10000; b = 10000; a is b if you compile() it:
1 0 LOAD_CONST 0 (10000)
3 STORE_NAME 0 (a)
6 LOAD_CONST 0 (10000)
9 STORE_NAME 1 (b)
12 LOAD_NAME 0 (a)
15 LOAD_NAME 1 (b)
18 COMPARE_OP 8 (is)
21 POP_TOP
22 LOAD_CONST 1 (None)
25 RETURN_VALUE
It's fundamentally similar, except for the line number/const number, NAME vs FAST, and the ending from POP_TOP on. Whereas if you assign the values on separate lines in the interactive shell, each line is compiled on its own, so the constant is not shared and a new int object is created each time.
*To add a bit more intrigue, if I put the one-line version into my IPython notebook, a is b is False.
Upvotes: 3
Reputation: 1165
I just figured it out: the 10000 values that share the same id live in the compiled code object:
>>> code = compile("a = 10000; b = 10000; a is b", "<string>", "exec")
>>> code.co_consts
(10000, None)
The compiler optimizes this: 10000 is created only once, which is safe because ints are immutable :)
Upvotes: 0
Reputation: 1
I'm new to Python, but I'll try to explain. The "is" operator checks whether two names refer to the same object, not whether their contents are equal. In your example, "a is b" checks whether "a" and "b" point to the same thing. If you change your program like this:
a = 1
b = 1
a is b
it will print True. It may be related to how Python stores values.
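The distinction between identity and equality is easiest to see with mutable objects, where no caching is involved. A small sketch in Python 3 with illustrative variable names:

```python
# Sketch: `==` compares contents, `is` compares object identity.
a = [1, 2, 3]
b = [1, 2, 3]  # a second list with equal contents
c = a          # another name for the first list

equal_contents = a == b  # True: same elements
same_object = a is b     # False: two separate list objects
alias = a is c           # True: both names refer to one object

print(equal_contents, same_object, alias)
```

For ints the answer to "is" depends on whether the interpreter happens to reuse an object, which is why == is the right comparison for values.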
Upvotes: 0