fengsp

Reputation: 1165

Weird id result on CPython IntObject

I wrote the same three statements in different ways and got different results. First, I ran them in an interactive shell:

>>> a = 10000
>>> b = 10000
>>> a is b
False

>>> a = 10000; b = 10000; a is b
True

Then I have one Python file that contains:

a = 10000
b = 10000
print a is b

When I run it, it prints True.

My Python environment:

Python 2.7.5 (default, Mar  9 2014, 22:15:05) 
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

What is going on here? Everyone mentions compilation. Does anyone know how the interactive shell compiles and runs these lines of code?

Upvotes: 4

Views: 193

Answers (4)

filmor

Reputation: 32182

I think it's a compile-time optimisation. In the first case each line is compiled on its own, first a = 10000 and then b = 10000, so the bytecode compiler has no (simple) way of knowing that the two literals could share one object.

In the other cases both assignments are compiled together, so the compiler sees that a and b are initialised from the same literal and can reuse a single constant.
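The difference between the two cases can be imitated without the interactive shell. This is only a rough sketch, assuming CPython: the helper `compile_and_run` is a made-up name, and the "single" compile mode is used because it is the mode meant for interactive input.

```python
def compile_and_run(sources):
    """Compile each source string separately, then run them all in one namespace."""
    namespace = {}
    for source in sources:
        exec(compile(source, "<input>", "single"), namespace)
    return namespace

# Two compilations, as when each assignment is typed at its own prompt.
separate = compile_and_run(["a = 10000", "b = 10000"])
# One compilation, as when both assignments share a line.
together = compile_and_run(["a = 10000; b = 10000"])

# Typically False on CPython: each compilation builds its own 10000 object.
separate_same = separate["a"] is separate["b"]
# True on CPython: one code object holds a single 10000 constant.
together_same = together["a"] is together["b"]
print(separate_same, together_same)
```

The result for the separate case is an implementation detail and should not be relied on.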

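This has nothing to do with the small integer optimisation. That one applies to any int in the cached range (up to 256 in CPython), however the value is produced, e.g.

>>> a = 256; b = 256; a is b
True

but a value computed outside that range is a new object each time:

>>> a = 256; b = 256; a + 1 is b + 1
False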
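To separate the small-integer cache from anything the compiler does, the values can be built at run time. A minimal sketch, assuming CPython and its usual cache range ending at 256; `runtime_int` is a hypothetical helper used only so that no literal for the compared value appears in the source.

```python
def runtime_int(text):
    """Build an int at run time so the compiler cannot share it as a constant."""
    return int(text)

small_a = runtime_int("256")
small_b = runtime_int("255") + 1
# Both names should refer to the one cached 256 object on CPython.
small_shared = small_a is small_b

large_a = runtime_int("257")
large_b = runtime_int("256") + 1
# Outside the assumed cache range, so usually two distinct objects.
large_shared = large_a is large_b

print(small_shared, large_shared)
```

The values are equal in both pairs; only the identity differs, and the exact cache range is a CPython implementation detail.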

The respective code is part of Python's peephole optimization (see https://github.com/python/cpython/blob/master/Python/peephole.c).

Upvotes: 1

Jason S

Reputation: 13779

If you put either version into a function, it also evaluates to True. When Python compiles a function into bytecode, it builds a table of the constants used, and equal constants are "collapsed" into one entry that is loaded twice. It looks like the interactive interpreter does the same when compiling one line of code.*

Here is the bytecode for one of these functions, obtained using dis. It is the same for either version except for the line numbers, so I am not copying both here.

  2           0 LOAD_CONST               1 (10000)
              3 STORE_FAST               0 (a)
              6 LOAD_CONST               1 (10000)
              9 STORE_FAST               1 (b)
             12 LOAD_FAST                0 (a)
             15 LOAD_FAST                1 (b)
             18 COMPARE_OP               8 (is)
             21 RETURN_VALUE

This is for:

def func():
    a = 10000; b = 10000; return a is b
from dis import dis
dis(func)

Note that both of the LOAD_CONST lines have the same argument. This is a reference to an index in func.__code__.co_consts which is a tuple. Element 1 of that tuple is the int object 10000.
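The constants tuple can also be checked directly instead of reading the disassembly. A small sketch, assuming CPython; the position of 10000 inside `co_consts` differs between versions, so it counts occurrences rather than relying on index 1.

```python
def func():
    a = 10000; b = 10000; return a is b

consts = func.__code__.co_consts
# The literal appears twice in the source but should be stored only once.
occurrences = consts.count(10000)
result = func()

print(consts, occurrences, result)
```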

Just for completeness' sake, here's the disassembly of the original one-liner a = 10000; b = 10000; a is b if you compile() it:

  1           0 LOAD_CONST               0 (10000)
              3 STORE_NAME               0 (a)
              6 LOAD_CONST               0 (10000)
              9 STORE_NAME               1 (b)
             12 LOAD_NAME                0 (a)
             15 LOAD_NAME                1 (b)
             18 COMPARE_OP               8 (is)
             21 POP_TOP
             22 LOAD_CONST               1 (None)
             25 RETURN_VALUE

It's fundamentally similar, except for the line number, the constant index, NAME vs FAST, and the ending from POP_TOP on. If you instead enter the assignments on separate lines in the interactive shell, each line is compiled on its own, so the constant is not shared and a new int object is created each time.

*To add a bit more intrigue, if I put the one-line version into my IPython notebook, a is b is False.

Upvotes: 3

fengsp

Reputation: 1165

I just figured it out. The 10000 values that share the same id both come from a single constant stored in the compiled code object:

>>> code = compile("a = 10000; b = 10000; a is b", "<string>", "exec")
>>> code.co_consts
(10000, None)

The compiler performs an optimization here: 10000 is created only once, which is safe because ints are immutable :)
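To confirm that a and b really end up pointing at that stored constant, the code object can be executed in a dict and the results compared by identity. This is a sketch assuming CPython; the constant is looked up by value because its index in `co_consts` is not guaranteed.

```python
code = compile("a = 10000; b = 10000", "<string>", "exec")
namespace = {}
exec(code, namespace)

# Find the int constant by value rather than by position.
constant = [c for c in code.co_consts if c == 10000][0]

# Both names should refer to the very object held in co_consts.
a_is_constant = namespace["a"] is constant
b_is_constant = namespace["b"] is constant
print(a_is_constant, b_is_constant)
```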

Upvotes: 0

BinDu

Reputation: 1

Although I am new to Python, I will try to explain this. The "is" operator checks whether two names refer to the same object, not whether their values are equal. In your example, Python checks whether "a" and "b" point to the same object. If you change your program like this:

a = 1
b = 1
print a is b

it will print True. This may be related to how Python stores values.
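The difference between equality and identity can be shown in a few lines. A minimal sketch, assuming CPython; the ints are built with int() at run time so that no compiler constant sharing is involved.

```python
first = int("10000")
second = int("10000")
alias = first

values_equal = first == second  # compares values
same_object = first is second   # compares identity; typically False here
alias_same = alias is first     # an alias always refers to the same object

print(values_equal, same_object, alias_same)
```

For comparing numbers, == is the operator to use; whether two equal ints are the same object is an implementation detail.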

Upvotes: 0
