Chris Harris

Reputation: 197

Python: accidentally created a reference but not sure how

I imagine this is one in a very long list of questions from people who have inadvertently created references in Python, but I've got the following situation. I'm using SciPy's minimize to set the sum of the top row of an array to 5 (as an example).

import scipy.optimize as spo

class problem_test:
    def __init__(self):
        test_array = [[1,2,3,4,5,6,7],
                      [4,5,6,7,8,9,10]]

        def set_top_row_to_five(x, array):
            array[0] = array[0] + x
            return abs(sum(array[0]) - 5)

        adjustment = spo.minimize(set_top_row_to_five,0,args=(test_array))

        print(test_array)
        print(adjustment.x)

ptest = problem_test()

However, the optimization is altering the original array (test_array):

[array([-2.03, -1.03, -0.03,  0.97,  1.97,  2.97,  3.97]), [4, 5, 6, 7, 8, 9, 10]]
[-0.00000001]

I realize I can solve this using, for example, deepcopy, but I'm keen to learn why this is happening so I don't do the same in future by accident.

Thanks in advance!

Upvotes: 0

Views: 174

Answers (1)

Ondrej K.

Reputation: 9664

Names are references to objects. What matters is whether an object (including one passed in as an argument) is modified in place or a new object is created. For example:

>>> l1 = list()
>>> l2 = l1
>>> l2.append(0)  # this modifies the object currently referenced by both l1 and l2
>>> print(l1)
[0]

Whereas:

>>> l1 = list()
>>> l2 = list(l1)  # New list object has been created with initial values from l1
>>> l2.append(0)
>>> print(l1)
[]

Or:

>>> l1 = list()
>>> l2 = l1
>>> l2 = [0]  # New list object has been created and assigned to l2
>>> l2.append(0)
>>> print(l1)
[]

Similarly assuming l = [1, 2, 3]:

>>> def f1(list_arg):
...    return list_arg.reverse()
>>> print(f1(l), l)
None [3, 2, 1]

We have just passed through the None returned by the list.reverse method and reversed l in place. However:

>>> def f2(list_arg):
...     ret_list = list(list_arg)
...     ret_list.reverse()
...     return ret_list
>>> print(f2(l), l)
[3, 2, 1] [1, 2, 3]

The function returns a new, reversed list initialized from l, which remains unchanged. (NOTE: in this example the built-in reversed or slicing would of course make more sense.)
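For completeness, here is a minimal sketch of the two alternatives mentioned in the note. The names original, by_slice and by_builtin are only illustrative. Neither approach touches the input list:

```python
original = [1, 2, 3]

# Slicing with a step of -1 builds a new, reversed list.
by_slice = original[::-1]

# reversed() returns an iterator over the same list;
# wrapping it in list() materializes a new list object.
by_builtin = list(reversed(original))

print(by_slice, by_builtin, original)
```

Both results are new objects, so later changes to them are not visible through original.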

When nested, one must not forget that for instance:

>>> l = [1, 2, 3]
>>> d1 = {'k': l}
>>> d2 = dict(d1)
>>> d1 is d2
False
>>> d1['k'] is d2['k']
True

Dictionaries d1 and d2 are two different objects, but the value under key k is a single list instance shared by both. This is the case where copy.deepcopy might come in handy.
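As a rough illustration of the difference, the sketch below compares a shallow copy with copy.deepcopy on the same kind of nested structure. The names shallow and deep are made up for this example:

```python
import copy

l = [1, 2, 3]
d1 = {'k': l}

shallow = copy.copy(d1)   # new dict, but the values are shared (like dict(d1))
deep = copy.deepcopy(d1)  # new dict and new copies of the nested objects

d1['k'].append(4)         # mutate the list reachable through d1

print(shallow['k'])       # still the same list object as d1['k']
print(deep['k'])          # independent copy, unaffected by the append
```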

Care needs to be taken when passing objects around to make sure they are either modified in place or copied, as intended. It can help to return None (or a similar generic value) when making in-place changes and to return the resulting object when working on a copy, so that the function/method interface itself hints at the intention and at what is actually going on.
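Applied to the question, array[0] = array[0] + x is an item assignment on the very list that was passed in, so the caller sees the change. The following is only a sketch in plain Python (no SciPy), with hypothetical function names, showing an in-place variant that returns None next to a variant that returns a new object:

```python
def shift_top_row_in_place(x, rows):
    # Item assignment stores a new row inside the caller's list object.
    rows[0] = [value + x for value in rows[0]]
    return None  # hints that the argument was changed in place


def shifted_top_row(x, rows):
    # Builds and returns a new row; the caller's list is left alone.
    return [value + x for value in rows[0]]


data = [[1, 2, 3], [4, 5, 6]]

new_row = shifted_top_row(10, data)
print(new_row, data)   # data is unchanged at this point

shift_top_row_in_place(10, data)
print(data)            # the first row of data has been replaced
```

An objective function passed to an optimizer is called many times, so it would normally follow the second pattern and compute its result from a local value.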

When immutable objects are "modified", a new object is actually created and assigned to a new name or back to the original name/reference:

>>> s = 'abc'
>>> print('0x{:x} {}'.format(id(s), s))
0x7f4a9dbbfa78 abc
>>> s = s.upper()
>>> print('0x{:x} {}'.format(id(s), s))
0x7f4a9c989490 ABC

Note, though, that even an immutable type can hold a reference to a mutable object. For instance, with l = [1, 2, 3]; t1 = (l,); t2 = t1, one can call t1[0].append(4). This change would also be seen through t2[0] (for the same reason as with d1['k'] and d2['k'] above), while the tuples themselves remain unmodified.
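The tuple case can be sketched as follows. The variable error is only there to capture the exception for illustration:

```python
l = [1, 2, 3]
t1 = (l,)
t2 = t1            # a second name for the same tuple object

t1[0].append(4)    # mutates the list held by the tuple, not the tuple itself
print(t2[0])       # the change is visible through t2 as well

error = None
try:
    t1[0] = [0]    # replacing a tuple item is what immutability forbids
except TypeError as exc:
    error = exc
    print(type(exc).__name__)
```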


One extra caveat (a possible gotcha): a default argument value is evaluated once, when the function is defined. If that default is a mutable object, then whenever the function is called without passing an object, it behaves like a "static" variable:

>>> def f3(arg_list=[]):
...     arg_list.append('x')
...     print(arg_list)
>>> f3()
['x']
>>> f3()
['x', 'x']

Since this is often not the behavior people expect at first glance, mutable objects are usually better avoided as default argument values.
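A common way around this is to use None as the default and create the list inside the function. The sketch below uses a hypothetical f4 as a variant of f3:

```python
def f4(arg_list=None):
    # A fresh list is created on every call that omits the argument.
    if arg_list is None:
        arg_list = []
    arg_list.append('x')
    return arg_list


first = f4()
second = f4()
print(first, second)
```

Each call without an argument now gets its own list, so nothing accumulates between calls.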

The same is true for class attributes, where one object is shared between all instances:

>>> class C(object):
...     a = []
...     def m(self):
...         self.a.append('x')  # We actually modify value of an attribute of C
...         print(self.a)
>>> c1 = C()
>>> c2 = C()
>>> c1.m()
['x']
>>> c2.m()
['x', 'x']
>>> c1.m()
['x', 'x', 'x']
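If a separate list per instance is wanted, one option is to create it in __init__ instead of in the class body. This is a sketch with a hypothetical class D:

```python
class D(object):
    def __init__(self):
        self.a = []  # a new list object for every instance

    def m(self):
        self.a.append('x')  # modifies only this instance's list
        return self.a


d1 = D()
d2 = D()
first_result = d1.m()
second_result = d2.m()
print(first_result, second_result)
```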

Note how the behavior differs when the class attribute is of an immutable type in a similar example:

>>> class C(object):
...     a = 0
...     def m(self):
...         self.a += 1  # We assign new object to an attribute of self
...         print(self.a)
>>> c1 = C()
>>> c2 = C()
>>> c1.m()
1
>>> c2.m()
1
>>> c1.m()
2
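One way to see what happens here is to inspect the instance namespace with vars() before and after the call. In this sketch the names before and after are illustrative only:

```python
class C(object):
    a = 0

    def m(self):
        # Reads C.a on the first call (no instance attribute exists yet),
        # then stores the result as an attribute on self.
        self.a += 1


c1 = C()
c2 = C()

before = dict(vars(c1))  # instance namespace before the call
c1.m()
after = dict(vars(c1))   # instance namespace after the call

print(before, after, C.a, c2.a)
```

The augmented assignment creates an instance attribute that shadows the class attribute, which is why C.a and c2.a are left untouched.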

All the fun details can be found in the documentation: https://docs.python.org/3.6/reference/datamodel.html

Upvotes: 1
