Why are larger int numbers returning different and sometimes the same ids?

Question

Today I learned about id and decided to put it to use and test things out. I know integers are immutable, so the id should be (?) the same. But as I was testing things out in the prompt, I noticed slight differences and wanted to find out the reason behind it.

a = 1
id(a)    # 10055552
id(1)    # 10055552
a = int(1)
id(a)    # 10055552

Cool! All checks out so far. But then...

a = 10000
id(a)           # 140230117375888
id(10000)       # 140230116779920
a = int(10000)
id(a)           # 140230116779920

# wait what?? try it again
id(10000)       # 140230116780080
# Huh!?

OK, so testing further, I noticed the consistent behavior held up to 256. Up to that value the id was at most 8 digits long, and from 257 onward it was a larger, 15-digit number. So my guess was that it has to do with int types being 8 bytes. Testing this out:

a = 256
id(a)     # 10063712
id(256)   # 10063712

a = 257
id(a)     # 140230116780080
id(257)   # 140230117375888
a = int(257)
id(a)     # 140230117375888
id(257)   # 140230116779920

So I figured out it has something to do with being 8 bytes long, but anything larger than 256 would re-use some of the same ids:

140230116780080
140230116780048
140230116780144
140230117375888
140230116779920

Please note the above list is non-exhaustive.

What is happening here under-the-hood? Why are some ids being re-used? Testing out multiple variables:

a = 257
b = 258
c = 259

id(a)      # 140230116780176
id(257)    # 140230116779984   <--- reused?
id(b)      # 140230116780080
id(258)    # 140230116780144
id(c)      # 140230116780048
id(259)    # 140230116779984   <--- reused?

id(257) == id(259)    # False

TL;DR - For integers above 256, why are some of the ids reused? I thought ids were supposed to be unique during an object's lifetime, yet some of these ids look identical when printed but compare as different. Please look at the last example provided.

Also, why are only a select few ids used for these larger integers? Maybe this is different for systems using many more variables?

Tested this on Python 3.4.3, GCC 4.8.4 on linux.

jasonharper · Accepted Answer

As an optimization, CPython pre-creates a range of int objects (-5 through 256 by default; the bounds are fixed when the interpreter is compiled) and always uses those objects in preference to creating a new int. For ints outside that range, the chance of ever needing the exact same int again is considered too low to be worth the effort of checking whether the needed int object already exists.
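A minimal sketch of how that cache can be observed, assuming CPython (other implementations may behave differently). The helper name `same_object` is made up for illustration. The ints are built from strings at runtime so the compiler cannot merge equal literals into a single constant.

```python
import sys

def same_object(value):
    """Build the same int twice at runtime; report whether both results are one object."""
    text = str(value)
    first = int(text)    # created at runtime, not a compile-time constant
    second = int(text)
    return first is second

print("implementation:", sys.implementation.name)
for n in (-6, -5, 0, 256, 257, 10000):
    # On CPython, values inside the pre-created range are expected to report
    # True and values outside it False. This is not a language guarantee.
    print(n, same_object(n))
```

This is only a way to look at the behavior, not something to build on.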

This is PURELY an implementation detail. If your code ever actually cares about it, you are doing something horribly wrong.
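The reuse seen in the question fits how `id()` is documented: an id is only guaranteed to be unique among objects that exist at the same time, so once a temporary int is freed, its address can be handed to the next one. A rough illustration, assuming CPython; `make_int` is a hypothetical helper and the actual counts and addresses will vary from run to run.

```python
def make_int(text):
    """Hypothetical helper: build an int from a string at runtime."""
    return int(text)

# Five separate objects held in a list are all alive at once, so any
# distinct objects among them must have distinct ids.
live = [make_int("257") for _ in range(5)]
live_ids = {id(obj) for obj in live}
print("distinct ids among live objects:", len(live_ids))

# Each temporary is discarded right after id() returns, so its address
# is free to be handed out again. No particular count is guaranteed.
temp_ids = [id(make_int("257")) for _ in range(5)]
print("distinct ids among temporaries:", len(set(temp_ids)))

# Values should be compared with ==, which does not depend on identity.
print(all(obj == 257 for obj in live))
```

In `id(257) == id(259)` typed at the prompt, both ints are most likely alive together as constants of the same compiled line. That would explain why the comparison is False even though separate lines printed the same number.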
