Reputation: 4829
Python's id() function returns the unique identifier for an object. So when I do something like this in my terminal:
>>> a = 23
>>> id(a)
28487496
Now, I know that Python keeps track of all the objects created and the number of references to each object, and when that count reaches 0, the object is garbage collected.
What I want to know is what happens when I do something like this:
>>> id(27)
28487498
I never created an object with the value 27, i.e. I never wrote b = 27, yet somehow I still get a unique identifier for it. Does this mean that an object was created in memory? If so, there should be 0 references to this object and it should have been garbage collected.
So, when is an object actually created in memory?
Please let me know if I am wrong somewhere.
Another interesting thing that I just found out is:
>>> a = 23
>>> id(a)
28487496
>>> id(20 + 3)
28487496
In this case Python remembers the reference to the number 23 itself. How does Python do this?
Upvotes: 2
Views: 304
Reputation: 1124000
Objects are created as needed, in different places.
To start, when you write
b = 27
two things happen. The 27 expression is evaluated, resulting in an integer object being pushed onto the stack, and then, as a separate step, the object is assigned to b. Assignment doesn't create objects.
If you did just this:
27
The 27 expression is still evaluated. The object would be created*, then destroyed again as its reference count drops back to 0.
That's needed because you could pass that object to another function:
id(27)
needs something to be passed to the id() function, so 27 is pushed onto the stack before the function is called.
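The create-then-destroy cycle described above can be observed with a weak reference. This is a minimal sketch, assuming CPython's reference counting (other implementations may free objects later); Probe and watch are hypothetical names introduced only for this illustration, since plain integers cannot be weakly referenced.

```python
import weakref


class Probe:
    """Hypothetical stand-in object; plain ints cannot be weakly referenced."""


def watch(obj):
    # A weak reference observes obj without adding to its reference count.
    return weakref.ref(obj)


# The Probe instance is never bound to a name, only passed as an argument.
temp_ref = watch(Probe())
# On CPython the instance is freed as soon as the call has finished.
temporary_is_gone = temp_ref() is None

# Binding the object to a name keeps it alive after the call.
kept = Probe()
kept_ref = watch(kept)
kept_is_alive = kept_ref() is kept
```

The temporary exists only for the duration of the call, just like the 27 passed to id().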
I'll use a mutable object instead of an integer to illustrate that a new object is created; so instead of id(27) I'll use id([]) and ask the dis module to show me the bytecode that Python would execute:
>>> import dis
>>> dis.dis(compile('id([])', '', 'exec'))
  1           0 LOAD_NAME                0 (id)
              2 BUILD_LIST               0
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
The BUILD_LIST 0 opcode creates the empty list object and pushes it onto the stack, and CALL_FUNCTION 1 then calls id, passing in one value from the stack, which is that list.
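The same check can be made programmatically. This is a small sketch, assuming CPython; the exact opcode names differ between Python versions (the call opcode in particular has been renamed in newer releases), so it only looks for BUILD_LIST. The '<example>' filename is an arbitrary placeholder.

```python
import dis

# Compile the expression and collect the opcode names CPython would execute.
code = compile('id([])', '<example>', 'exec')
opnames = [ins.opname for ins in dis.get_instructions(code)]
has_build_list = 'BUILD_LIST' in opnames

# Each evaluation of [] runs BUILD_LIST again and yields a separate object.
first = []
second = []
distinct_objects = first is not second
```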
I didn't use id(27) because immutable constants such as integers and tuples are actually stored with the compiled bytecode; these objects are created when Python compiles the code (or when it loads the .pyc bytecode cache from disk):
>>> dis.dis(compile('id(27)', '', 'exec'))
  1           0 LOAD_NAME                0 (id)
              2 LOAD_CONST               0 (27)
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE
Note the LOAD_CONST opcode; it loads the object from the co_consts structure:
>>> compile('id(27)', '', 'exec').co_consts
(27, None)
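Because the constant lives on the code object, it is created once at compile time and reused on every execution. A minimal sketch, assuming CPython; answer is a hypothetical function, and 1000 is chosen on purpose so that the small-integer cache plays no part.

```python
def answer():
    # The literal 1000 is created when this function is compiled.
    return 1000


# The constant is stored on the function's code object.
stored_on_code_object = 1000 in answer.__code__.co_consts

# Every call hands back that same stored object rather than a new one.
same_object_each_call = answer() is answer()
```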
So objects can be created when compiling, or when executing special opcodes for specific Python syntax.
There are more places:
Calling a class invokes its __new__ method (object.__new__ by default), which creates an instance object on the heap. So CustomClass(arg1, arg2) creates an object with the right type.
int(somevalue) creates an integer object on the heap.
class and def statements and the lambda expression create objects (class objects, functions, and more functions; these are all objects too).

* Small integers are actually interned; for performance reasons, CPython keeps a single copy of each of the integers between -5 and 256, so these objects are created only once and referenced everywhere you need one. See "is" operator behaves unexpectedly with integers. For the purposes of this answer I'm ignoring this.
And because they are interned, 20 + 3 evaluates to that single copy, and its id() will be the same as if you had asked for id(23) directly.
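The small-integer cache can be seen at runtime. This sketch assumes CPython and deliberately builds the values with int() from strings, so that compile-time constants do not influence the result; the variable names are only for illustration.

```python
# Build the values at runtime rather than from literals.
direct = int("23")
computed = int("20") + int("3")

# On CPython both names refer to the single cached 23 object.
small_shared = direct is computed

# Outside the cached range, equal values are usually separate objects,
# so this identity check is typically False on CPython.
big_one = int("100000")
big_two = int("100000")
large_shared = big_one is big_two
```

This is an implementation detail; compare integers with ==, not with is.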
There are many more implementation details. Some string objects are interned (see my answer here). Code evaluated in the interactive interpreter is compiled one top-level block at a time, but in a script compilation is done per scope instead. Because constants are attached to compiled code objects, there are differences as to when constants are shared, and so on.
The only objects you can rely on not being recreated all the time are those explicitly documented in the datamodel documentation as being singletons, None being the most prominent of these.
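As a brief illustration of that guarantee, calling the type of a singleton does not produce a second instance; the variable names here are only for illustration.

```python
# NoneType() does not create a new object; it returns the existing None.
none_again = type(None)()
none_is_singleton = none_again is None

# Ellipsis is another singleton documented in the data model.
ellipsis_again = type(Ellipsis)()
ellipsis_is_singleton = ellipsis_again is Ellipsis
```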
Upvotes: 4