Reputation: 63
Consider the following minimal example. Can somebody explain the apparently inconsistent logic of numpy
when it comes to copying list elements of varying nesting depths?
import numpy as np
L = [[[[1, 1], 2, 3]]]
A1 = np.array(L)
A2 = A1.copy()
A1[0][0][2] = 'xx'
A1[0][0][0][0] = 'yy'
print "\nA1 after changes:\n{}".format(A1)
print "\nA2 only partially changed:\n{}".format(A2)
Results:
A1 after changes:
[[[['yy', 1] 2 'xx']]]
A2 only partially changed:
[[[['yy', 1] 2 3]]]
Then:
>>> print A1[0][0][2] == A2[0][0][2]
False
>>> print A1[0][0][0][0] == A2[0][0][0][0]
True
I have a hard time explaining to myself why 3 is not replaced, but 1 in a deeper level is.
A2 = np.array(A1, copy=True) and A2 = np.empty_like(A1); np.copyto(A2, A1) behave the same as the code above.
A2 = A1[:] behaves the same as A2 = A1: both are identical after changes.
import copy; A2 = copy.deepcopy(A1) is the only solution I found to create an independent copy.
Reputation: 231385
Look at your array, and understand its structure first:
In [139]: A1
Out[139]: array([[[[1, 1], 2, 3]]], dtype=object)
In [140]: A1.shape
Out[140]: (1, 1, 3)
It's a dtype=object array; that is, the elements are object pointers, not numbers. Also, it is 3d, with 3 elements.
In [142]: A1[0,0]
Out[142]: array([[1, 1], 2, 3], dtype=object)
Since it is an array, A1[0,0] is better than A1[0][0]. Functionally the same, but clearer. A1[0,0,:] is even better. Anyway, at this level we still have an array with shape (3,), i.e. 1d with 3 elements.
In [143]: A1[0,0,0]
Out[143]: [1, 1]
In [144]: A1[0,0,2]
Out[144]: 3
Now we get a list and a number, individual elements of A1. The list is mutable, the number is not.
We can change the 3rd element (a number) to a string:
In [148]: A1[0,0,2]='xy'
To change an item inside the 1st element (a list), I have to use mixed indexing: array indexing to reach the list, then list indexing on the list itself. A 4-level array index does not work.
In [149]: A1[0,0,0,0]
...
IndexError: too many indices for array
In [150]: A1[0,0,0][0]='yy'
In [151]: A1
Out[151]: array([[[['yy', 1], 2, 'xy']]], dtype=object)
A1 is still a 3d object array; we have just changed a couple of elements. The 'xy' change is different from the 'yy' change. The first replaced a pointer stored in the array; the second modified a list that the array points to.
A2=A1.copy() makes a new array with its own data buffer, but for an object array that buffer holds pointers, so only the pointers are copied. So A2 has pointers to the same objects as A1.
The 'xy' change replaced the pointer in A1, but did not change the A2 copy.
The 'yy' change modified the list pointed to by A1. A2 has a pointer to the same list, so it sees the change.
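The difference between the two changes can be checked with an identity test. This is only a minimal sketch written for Python 3; it builds the object array element by element (the name inner is mine, not from the question) so that it does not depend on how a given numpy version handles ragged nested lists.

```python
import numpy as np

inner = [1, 1]                           # a mutable Python list
A1 = np.empty((1, 1, 3), dtype=object)   # same shape as the array above
A1[0, 0, 0] = inner                      # slot 0 holds a pointer to the list
A1[0, 0, 1] = 2
A1[0, 0, 2] = 3

A2 = A1.copy()                           # copies the pointers, not the objects

# Both arrays point at the very same list object.
shared = A2[0, 0, 0] is A1[0, 0, 0]

A1[0, 0, 2] = 'xy'                       # replaces a pointer in A1's buffer only
A1[0, 0, 0][0] = 'yy'                    # mutates the shared list in place

print(shared)                            # True
print(A2[0, 0, 2])                       # 3
print(A2[0, 0, 0])                       # ['yy', 1]
```

Assigning to a slot of A1 never touches A2, while mutating the list is visible through every pointer to it, including the original name inner.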
Note that L, the original nested list, sees the same change:
In [152]: L
Out[152]: [[[['yy', 1], 2, 3]]]
A3 = A1[:] produces a view of A1. A3 has the same data buffer as A1, so it sees all the changes.
A4 = A1 would also see the same changes, but A4 is a new reference to A1, not a view or a copy.
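The three cases (copy, view, reference) can be told apart with is, the base attribute and np.shares_memory. A small sketch, using a 1d object array for brevity rather than the 3d one from the question:

```python
import numpy as np

A1 = np.empty(3, dtype=object)
A1[0] = [1, 1]
A1[1] = 2
A1[2] = 3

A2 = A1.copy()   # copy: new array object, new data buffer
A3 = A1[:]       # view: new array object, same data buffer
A4 = A1          # reference: just another name for the same array object

print(A4 is A1)                    # True
print(A3 is A1, A3.base is A1)     # False True
print(np.shares_memory(A1, A3))    # True
print(np.shares_memory(A1, A2))    # False

A1[2] = 'xy'                       # replace a pointer in the shared buffer
print(A3[2], A4[2], A2[2])         # xy xy 3
```

The view and the reference both see the replaced element because they read the same buffer; the copy keeps its own pointer to 3.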
The duplicate answer that was raised earlier dealt with references, copies and deep copies of lists. That is relevant here because L is a list, and A1 is an object array, which in many ways is an array wrapper around a list. But A1 is also a numpy array, which has the added distinction between view and copy.
This is not a good use of numpy arrays, not even the object dtype version. It's an instructive example, but too confusing to be practical. If you need to do a deepcopy on an array, you are probably using arrays wrong.
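For completeness, here is a rough sketch of what copy.deepcopy does differently from the array's own copy method on an object array, next to a plain numeric array where copy is already independent. The names A_shallow, A_deep, N1 and N2 are just for illustration.

```python
import copy
import numpy as np

A1 = np.empty(3, dtype=object)
A1[0] = [1, 1]
A1[1] = 2
A1[2] = 3

A_shallow = A1.copy()        # copies pointers; the list is still shared
A_deep = copy.deepcopy(A1)   # also copies the objects the pointers refer to

A1[0][0] = 'yy'              # mutate the list held by A1

print(A_shallow[0])          # ['yy', 1]
print(A_deep[0])             # [1, 1]

# With a numeric dtype the buffer holds the values themselves,
# so an ordinary copy is fully independent.
N1 = np.array([[1, 1], [2, 3]])
N2 = N1.copy()
N1[0, 0] = 99
print(N2[0, 0])              # 1
```

With numeric dtypes there are no shared Python objects to worry about, which is one reason the object dtype case feels so inconsistent by comparison.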