Reputation: 2783
Suppose I have the following class:
class A:
    arr = []
If I append to arr for an instance of A, all instances of A are updated.
>>> a1, a2 = A(), A()
>>> a1.arr.append(0)
>>> a1.arr
[0]
>>> a2.arr
[0]
>>> A.arr
[0]
However, if I set arr to a list literal for an instance of A, other instances are not updated:
>>> a1.arr = [1,2,3]
>>> a1.arr
[1, 2, 3]
>>> a2.arr
[0]
>>> A.arr
[0]
Why does this occur? When the class attribute is a list, why do append and = give different results?
I also noticed similar behavior when the class attribute is not a list:
class B:
    value = ''
>>> b1, b2 = B(), B()
>>> b1.value = 'hello'
>>> b1.value
'hello'
>>> b2.value
''
>>> B.value
''
>>> B.value = 'goodbye'
>>> b1.value
'hello'
>>> b2.value
'goodbye'
>>> B.value
'goodbye'
Why does the behavior seem different when the class attribute is a string? When b1's value is already set, why does B.value = ... only update b2's value and not b1's?
Upvotes: 2
Views: 3487
Reputation: 36
In short, Python doesn't have variables in the sense of boxes that hold values. What looks like a variable is actually a name (like an alias in other languages). The = operator, called assignment, binds a name to an object. (In Python, everything is an object.)
For example:
x = 3
The = doesn't really change the value of x, because there is no variable x that contains a value. Instead, it takes the immutable object 3 and makes x a name bound to it (similar to a reference in C++).
So, if we do
>>> a = [1,2]
>>> b = a
>>> print(id(a)) # in CPython, id(obj) returns the object's address in memory
2426261961288
>>> print(id(b))
2426261961288
>>> a is b # the "is" operator tests whether a and b refer to the same object
True
>>> b.append(3)
>>> print(id(b)) # b's address didn't change
2426261961288
>>> print(a)
[1, 2, 3]
>>> print(b)
[1, 2, 3]
First, a = [1,2] binds the name a to a mutable object, the list [1, 2]. (To make this easier to follow, I'll nickname this underlying object OBJ_288.)
Then, b = a binds b to the same object that a refers to, OBJ_288.
You can see that id(a) is the same as id(b), which means both names refer to the same object.
b.append(3) changes the object that b is bound to (b.append refers to a method of OBJ_288). OBJ_288 is now [1, 2, 3], and both a and b are still bound to it.
So when we print(a) and print(b), the results are the same.
However, if we do
>>> b = [4, 5, 6]
>>> a is b
False
>>> id(a)
2426261961288
>>> id(b)
2426262048840
>>> print(a)
[1, 2, 3]
When we assign to b with =, b is bound to another object (here, the new list created by [4, 5, 6]; let's nickname it OBJ_840).
Since a still refers to OBJ_288, print(a) still shows [1, 2, 3].
For details, see the following references (if you know C++, the first two will be easier to follow):
https://realpython.com/pointers-in-python/#names-in-python
https://eev.ee/blog/2012/05/23/python-faq-passing/
The detailed rules are also given in the official Python reference:
https://docs.python.org/3/reference/executionmodel.html#naming-and-binding
In your code:
class A:
    arr = []
>>> a1, a2 = A(), A()
>>> a1.arr.append(0)
a1 is an instance of class A. In Python, every instance has its own namespace, which is the first place searched when an attribute is looked up; if the attribute is not found there, the search continues in the class namespace (where the class attributes are).
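A minimal sketch of that lookup order, using the built-in vars() to peek at the two namespaces. The class A is the one from the question; the variable names instance_ns, found_on_class and same_object are only for illustration.

```python
class A:
    arr = []

a1 = A()

# The instance namespace starts out empty: a1 has no attribute of its own.
instance_ns = dict(vars(a1))

# The class namespace is where arr actually lives.
found_on_class = "arr" in vars(A)

# So a1.arr falls through to the class and yields the very same list object.
same_object = a1.arr is A.arr

print(instance_ns, found_on_class, same_object)
```

Because the lookup ends at the class, anything done to the object returned by a1.arr is done to the one list that A.arr names.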
So in your case, since a1 has no instance attribute named arr, a1.arr refers to the class attribute A.arr. append is a method of that list, so calling it modifies A.arr, which results in:
>>> A.arr
[0]
But if you do
>>> a1.arr = [1,2,3]
Recall from above that the = assignment binds the name on its left-hand side to the object on its right-hand side.
Also, in Python, assignments to names always go into the innermost scope (unless global or nonlocal says otherwise). Attribute assignment is analogous: here it binds the object [1,2,3] to a1.arr, an instance attribute of a1, even though that attribute did not exist before.
Now a1 has a new instance attribute arr, so by the attribute name resolution rule, a1.arr shadows A.arr. That's why:
>>> a1.arr
[1, 2, 3]
And the class attribute A.arr is not affected:
>>> a2.arr
[0]
>>> A.arr
[0]
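One way to see the shadowing, as a rough sketch: inspect each instance's namespace with vars(), then delete the instance attribute and watch the lookup fall back to the class. The names shadowing, untouched and after_del are only for illustration.

```python
class A:
    arr = []

a1, a2 = A(), A()
a1.arr.append(0)       # mutates the shared class-level list
a1.arr = [1, 2, 3]     # creates an instance attribute on a1 only

shadowing = dict(vars(a1))   # a1 now has its own 'arr' entry
untouched = dict(vars(a2))   # a2 still has no attributes of its own

del a1.arr                   # remove the instance attribute again
after_del = a1.arr           # lookup falls back to A.arr

print(shadowing, untouched, after_del)
```

After the del, a1.arr is the class-level list again, which shows the class attribute was only hidden, never replaced.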
Reference:
https://docs.python.org/3/reference/datamodel.html#the-standard-type-hierarchy (see the item "Class instances")
https://docs.python.org/3/tutorial/classes.html#a-word-about-names-and-objects
Upvotes: 0
Reputation: 472
If you're asking how to avoid this:
class A:
    def __init__(self):
        self.ls = []
a, b = A(), A()
a.ls.append(0)
Using __init__() gives each instance its own list.
Here's an example of what you're doing:
class B:
    ls = []
    def __init__(self):
        pass
c, d = B(), B()
# "is" checks identity; == would also be True for two separate empty lists
c.ls is d.ls
Out[21]: True
As you can see, they still reference the same list, the one that belongs to class B.
This is because one is a class attribute, while the other is an instance attribute.
So the list ls that is declared outside __init__ is shared by all the instances of B.
Upvotes: -1
Reputation: 3401
When you define a class variable and assign a list to it, the class variable holds a reference to (the address of) that list:
class A:
    arr = []
That's why, in the first case, when you append 0 to arr, it shows up in every object's arr.
When you assign a1.arr = [1,2,3], a1 gets its own arr that refers to a new list, which is why a2.arr doesn't change.
In the second case, you are assigning a new string to value on b1 only, so changing b1.value doesn't change b2.value:
class B:
    value = ''
By the way, in other languages this problem comes down to the difference between reference and value.
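In Python specifically, the split that matters seems to be rebinding versus mutating rather than string versus list. A small sketch, assuming a hypothetical extra class attribute items added to B purely for comparison:

```python
class B:
    value = ''
    items = []   # hypothetical list attribute, not in the original question

b1, b2 = B(), B()

# Rebinding through an instance: behaves the same for a str and a list.
# Each assignment creates an instance attribute on b1 and leaves B alone.
b1.value = 'hello'
b1.items = [1]

# Mutating through an instance: only possible for the list (str is immutable).
# b2 has no items of its own, so this changes the list stored on the class.
b2.items.append(99)

print(b1.value, b2.value, b1.items, b2.items, B.items)
```

Assignment never touches the class attribute, whatever its type; only in-place mutation of a shared mutable object is visible through every instance.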
Upvotes: 1
Reputation: 77857
You're mixing up class attributes and instance attributes. Looking up an attribute on an instance falls back to the class attribute. However, when you assign to that attribute on a specific instance, you create an instance attribute. Let's walk through your sequence with class B:
class B:
    value = ''
# You have a single attribute, `B.value`
b1, b2 = B(), B()
b1.value = 'hello'
# This shadows b1's reference to B.value,
# inserting a local reference to its own attribute of the same name.
# You can check this with the id() function
b2.value # this still refers to the class attribute.
Is it clear from here?
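A rough sketch of the id() check mentioned in the comments, continuing through the B.value = 'goodbye' step from the question. The helper variable names are only for illustration.

```python
class B:
    value = ''

b1, b2 = B(), B()

# Before any assignment, b1.value resolves to the class attribute itself.
before = id(b1.value) == id(B.value)

b1.value = 'hello'   # creates an instance attribute on b1

b1_has_own = 'value' in vars(b1)   # b1 now carries its own 'value'
b2_has_own = 'value' in vars(b2)   # b2 does not
b2_still_class = id(b2.value) == id(B.value)

B.value = 'goodbye'  # rebinding the class attribute
seen_by_b1 = b1.value   # still shadowed by the instance attribute
seen_by_b2 = b2.value   # follows the class attribute

print(before, b1_has_own, b2_has_own, b2_still_class, seen_by_b1, seen_by_b2)
```

This is why B.value = 'goodbye' reaches b2 but not b1: b2 still looks the name up on the class, while b1 finds its own attribute first.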
Upvotes: 3
Reputation: 2706
class C:
    class_attribute = 2
    def __init__(self):
        self.instance_attribute = 'boo'
If you read an attribute of a class instance (my_instance.foo), the returned value is the instance attribute if it exists, otherwise the class attribute.
If you assign to an attribute of an instance (my_instance.foo = 42), an instance attribute is created; it can have the same name as a class attribute, in which case it shadows the class attribute.
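The two rules above can be sketched with a small helper. where_is is a hypothetical function written for this illustration, and it is a simplification: it ignores descriptors, __getattr__ and objects without a __dict__.

```python
class C:
    class_attribute = 2

    def __init__(self):
        self.instance_attribute = 'boo'


def where_is(obj, name):
    """Report which namespace supplies obj.<name> (simplified, hypothetical helper)."""
    if name in vars(obj):
        return 'instance'
    # Walk the class and its bases in method resolution order.
    if any(name in vars(klass) for klass in type(obj).__mro__):
        return 'class'
    return 'missing'


c = C()
r_instance = where_is(c, 'instance_attribute')   # set in __init__
r_class = where_is(c, 'class_attribute')         # found on C
r_missing = where_is(c, 'foo')                   # defined nowhere

c.class_attribute = 42                           # assignment creates an instance attribute
r_shadowed = where_is(c, 'class_attribute')

print(r_instance, r_class, r_missing, r_shadowed, C.class_attribute)
```

After the assignment the helper reports the instance as the source, while C.class_attribute itself is unchanged.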
Upvotes: 0
Reputation: 1212
I believe this answer explains what's happening.
In the class A, arr is a class attribute:
...all instances of Foo [A] share foovar [arr]
When you .append(), you're operating directly on the list object arr. When you assign (a1.arr = [1, 2, 3]), you're creating a new list object and assigning it as an instance attribute (effectively self.arr) on a1 that takes priority over the class attribute A.arr.
If we don't touch foovar, it's the same for both f and Foo. But if we change f.foovar... << code snippet >> ...we add an instance attribute that effectively masks the value of Foo.foovar. Now if we change Foo.foovar directly, it doesn't affect our foo instance:
Upvotes: 1