Psyduck

Reputation: 667

Difference between Assignment and deep copy on a class object

I have read about the difference between a shallow copy and an assignment in Python:

"A shallow copy constructs a new object, while an assignment will simply point the new variable at the existing object. Any changes to the existing object will affect both variables (with assignment)"

In the example below, I have a class that contains two attributes, train_data and train_labels.

class test(object):

    def __init__(self, train_data, train_labels):
        self.train_data = train_data  
        self.train_labels = train_labels

    def fit(self, train_data, train_labels):
        self.train_data = train_data
        self.train_labels = train_labels

I created an instance A of the class and stored its train_data in initial_train_data. Then I changed A's train_data to [1,2,3]. Finally, I checked the initial_train_data variable again:

A = test([1,2,3,4,5], ['a','b','c','d','e'])
initial_train_data = A.train_data
# A.train_data is [1, 2, 3, 4, 5]
A.train_data = [1,2,3]
# A.train_data is [1, 2, 3]
print(initial_train_data)
# prints [1, 2, 3, 4, 5]

I am confused. I thought that with initial_train_data = A.train_data I was just assigning A.train_data's memory location to initial_train_data, so when I changed A.train_data, initial_train_data should have changed as well. But it didn't.

Can someone explain the reason to me?

Upvotes: 1

Views: 125

Answers (1)

cs95

Reputation: 402563

There's a difference between modifying the object a variable points to, and reassigning the variable itself.

Consider A = [1, 2, 3]. Now, setting B = A means that both A and B point to the same object in memory. Keep in mind that A and B are just references and are not related to each other in any way besides the fact that they point to the same object in memory.
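As a minimal sketch of the point above, the `is` operator and `id()` can be used to check whether two names refer to the same object. The name `C` and the result variables are introduced here only for illustration; they are not part of the original example.

```python
A = [1, 2, 3]
B = A                         # no copy is made; B is a second name for the same list

same_object = A is B          # True: both names refer to one list object
same_id = id(A) == id(B)      # True: the identities match

C = list(A)                   # a shallow copy: a new list with equal contents
equal_contents = C == A       # True: same values
copy_is_same = C is A         # False: C is a different object
```

Assignment never creates a new list here; only the explicit `list(A)` call does.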

Now, if you were to perform A[0] = 999, then, you could print B[0] and see the same 999 being displayed, because they still point to the same object. However, if you set A = [4, 5, 6], then that has no impact on B, which is still pointing to [999, 2, 3].
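Applied to the class in the question, the same distinction might look like the sketch below. `Trainer` is a hypothetical stand-in for the asker's `test` class, and the `after_*` names are assumptions added for illustration.

```python
class Trainer:
    """Hypothetical stand-in for the `test` class in the question."""

    def __init__(self, train_data, train_labels):
        self.train_data = train_data
        self.train_labels = train_labels


a = Trainer([1, 2, 3, 4, 5], ['a', 'b', 'c', 'd', 'e'])
initial_train_data = a.train_data          # second reference to the same list

# Mutation: changes the shared list in place, so both names see it.
a.train_data[0] = 999
after_mutation = list(initial_train_data)  # snapshot: [999, 2, 3, 4, 5]

# Rebinding: the attribute now refers to a brand-new list.
a.train_data = [1, 2, 3]
after_rebinding = initial_train_data       # unchanged: [999, 2, 3, 4, 5]
still_shared = a.train_data is initial_train_data  # False
```

The question's code only rebinds `A.train_data`, which is why `initial_train_data` keeps referring to the original list.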

For more details (and pictures!), I'd recommend referring to SO veteran Ned Batchelder's HOWTO on variable names and references.

Upvotes: 1
