Khris

Reputation: 3212

Python class variable getting altered by changing instance variable that should just take its value

I'm stumbling over a weird effect when initializing a Python class. Not sure if I'm overlooking something obvious or not.

First things first, I'm aware that apparently lists passed to classes are passed by reference while integers are passed by value as shown in this example:

class Test:
  def __init__(self,x,y):
    self.X = x
    self.Y = y
    self.X += 1
    self.Y.append(1)

x = 0
y = []
Test(x,y)
Test(x,y)
Test(x,y)
print x, y

Yielding the result:

0 [1, 1, 1]

So far so good. Now look at this example:

class DataSheet:
  MISSINGKEYS = {u'Item': ["Missing"]}

  def __init__(self,stuff,dataSheet):
    self.dataSheet = dataSheet
    if self.dataSheet.has_key(u'Item'):
      self.dataSheet[u'Item'].append(stuff[u'Item'])
    else:
      self.dataSheet[u'Item'] = self.MISSINGKEYS[u'Item']

Calling it like this

stuff = {u'Item':['Test']}
ds = {}
DataSheet(stuff,ds)
print ds
DataSheet(stuff,ds)
print ds
DataSheet(stuff,ds)
print ds

yields:

{u'Item': ['Missing']}
{u'Item': ['Missing', ['Test']]}
{u'Item': ['Missing', ['Test'], ['Test']]}

Now let's print MISSINGKEYS instead:

stuff = {u'Item':['Test']}
ds = {}
DataSheet(stuff,ds)
print DataSheet.MISSINGKEYS
DataSheet(stuff,ds)
print DataSheet.MISSINGKEYS
DataSheet(stuff,ds)
print DataSheet.MISSINGKEYS

This yields:

{u'Item': ['Missing']}
{u'Item': ['Missing', ['Test']]}
{u'Item': ['Missing', ['Test'], ['Test']]}

The exact same output. Why?

MISSINGKEYS is a class variable but at no point is it deliberately altered.

In the first call the class goes into this line:

self.dataSheet[u'Item'] = self.MISSINGKEYS[u'Item']

Which apparently starts it all. Obviously I only want self.dataSheet[u'Item'] to take the value of self.MISSINGKEYS[u'Item'], not to become a reference to it or something like that.

In the following two calls the line

self.dataSheet[u'Item'].append(stuff[u'Item'])

is executed instead, and the append affects both self.dataSheet[u'Item'] AND self.MISSINGKEYS[u'Item'], which it should not.

This leads to the assumption that after the first call both variables now reference the same object.

However, although the two dictionaries are equal, they are not the same object:

ds == DataSheet.MISSINGKEYS
Out[170]: True
ds is DataSheet.MISSINGKEYS
Out[171]: False

Can someone explain to me what is going on here and how I can avoid it?

EDIT: I tried this:

ds[u'Item'] is DataSheet.MISSINGKEYS[u'Item'] 
Out[172]: True

So okay, this one entry in both dictionaries references the same object. How can I just assign the value instead?

Upvotes: 0

Views: 111

Answers (2)

PM 2Ring

Reputation: 55499

Thinking about what happens in Python function calls in terms of "pass by reference" vs "pass by value" is not generally useful; some people like to use the term "pass by object". Remember, everything in Python is an object, so even when you pass an integer to a function you're actually passing (in C terminology) a pointer to that integer object.

In your first code block you do

self.X += 1

This doesn't modify the current integer object bound to self.X. It creates a new integer object with the appropriate value and binds that object to the self.X name.
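A minimal sketch of that rebinding, written in Python 3 syntax (the question's code is Python 2). The Counter class and its rebound attribute are hypothetical names used only for illustration:

```python
# Hypothetical stand-in for the Test class from the question (Python 3).
class Counter:
    def __init__(self, x):
        self.X = x
        before = self.X   # keep a reference to the original int object
        self.X += 1       # creates a new int object and rebinds self.X to it
        # True if self.X now refers to a different object than before
        self.rebound = self.X is not before

x = 0
c = Counter(x)
print(x, c.X, c.rebound)  # prints: 0 1 True
```

The caller's x still refers to the original 0 because nothing ever changed that object; only the name self.X was pointed somewhere else.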

Whereas, with

self.Y.append(1)

you are mutating the current list object that's bound to self.Y, which happens to be the list object that was passed to Test.__init__ as its y parameter. This is the same y list object in the calling code, so when you modify self.Y you are changing that y list object in the calling code. OTOH, if you did an assignment like

self.Y = ['new stuff']

then the name self.Y would be bound to the new list, and the old list (which is still bound to y in the calling code) would be unaffected.
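The difference between mutating and rebinding can be sketched with two small functions. This is Python 3 syntax, and mutate and rebind are hypothetical helper names, not part of the question's code:

```python
def mutate(lst):
    # Changes the list object itself, so the caller sees the change.
    lst.append(1)

def rebind(lst):
    # Binds the local name to a brand-new list; the caller's list is untouched.
    lst = ['new stuff']
    return lst

y = []
mutate(y)
new = rebind(y)
print(y, new)  # prints: [1] ['new stuff']
```

After both calls, y holds only the element added by mutate, and the list returned by rebind is a separate object.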

You may find this article helpful: Facts and myths about Python names and values, which was written by SO veteran Ned Batchelder.

Upvotes: 1

juanpa.arrivillaga

Reputation: 96287

Here:

 else:
  self.dataSheet[u'Item'] = self.MISSINGKEYS[u'Item']

You are setting dataSheet['Item'] to the list that is the value of MISSINGKEYS['Item']. The same list, not a copy. Try

 else:
  self.dataSheet[u'Item'] = list(self.MISSINGKEYS[u'Item']) 

To make a copy.
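Here is a sketch of the whole class with that change applied, ported to Python 3 as an assumption (plain string keys instead of u'...', the in operator instead of has_key, and print as a function). The logic is otherwise the one from the question:

```python
class DataSheet:
    MISSINGKEYS = {'Item': ['Missing']}

    def __init__(self, stuff, dataSheet):
        self.dataSheet = dataSheet
        if 'Item' in self.dataSheet:
            self.dataSheet['Item'].append(stuff['Item'])
        else:
            # list(...) builds a new list with the same elements (a shallow copy),
            # so later appends no longer touch the class-level list.
            self.dataSheet['Item'] = list(self.MISSINGKEYS['Item'])

stuff = {'Item': ['Test']}
ds = {}
for _ in range(3):
    DataSheet(stuff, ds)

print(ds)                     # prints: {'Item': ['Missing', ['Test'], ['Test']]}
print(DataSheet.MISSINGKEYS)  # prints: {'Item': ['Missing']}
```

Note that list() only copies the outer list. If the elements themselves were mutable and you needed them to be independent as well, copy.deepcopy from the standard library would be the usual tool.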

Upvotes: 1
