Reputation: 6187
I have a small piece of test code:
import numpy as np
foo = np.zeros(1, dtype=int)
bar = np.zeros((10, 1), dtype=int)
foo_copy = np.copy(foo)
bar[-1] = foo_copy
foo_copy[-1] = 10
print(foo_copy)
print(bar)
I was expecting both foo_copy and the last element of bar to contain the value 10, but instead the last element of bar is still an np array with the value 0 in it:
[10]
[[0]
 [0]
 [0]
 [0]
 [0]
 [0]
 [0]
 [0]
 [0]
 [0]] # <<--- why not 10?
Isn't that last element pointing to foo_copy?
Or does NumPy copy the data on every assignment, so that I can't change it through the original ndarray?
If so, is there a way to keep that last element as a pointer to foo_copy?
Upvotes: 0
Views: 53
Reputation: 231395
A numpy array holds numeric values, not references (at least for numeric dtypes):
Make a 1d array, and reshape it to 2d:
In [64]: bar = np.arange(12).reshape(4,3)
In [65]: bar
Out[65]:
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
Another 1d array:
In [66]: foo = np.array([10])
In [67]: foo
Out[67]: array([10])
This assignment is by value:
In [68]: bar[1,1] = foo
In [69]: bar
Out[69]:
array([[ 0,  1,  2],
       [ 3, 10,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
So is this, though the value is broadcast to the whole row:
In [70]: bar[2] = foo
In [71]: bar
Out[71]:
array([[ 0,  1,  2],
       [ 3, 10,  5],
       [10, 10, 10],
       [ 9, 10, 11]])
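To tie this back to the question, here is a minimal sketch (the array shapes and the value 99 are just for illustration) showing that after a by-value assignment, later changes to the source array do not reach the target:

```python
import numpy as np

foo = np.array([10])
bar = np.zeros((4, 1), dtype=int)

bar[-1] = foo   # copies the value 10 into bar's own buffer
foo[0] = 99     # a later change to foo does not reach bar

print(bar[-1])                     # [10]
print(np.shares_memory(foo, bar))  # False: two separate buffers
```

np.shares_memory is a handy check whenever it is unclear whether two arrays are backed by the same data.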
We can view the 2d array as 1d. This is a closer representation of how the values are actually stored (but in a C byte array, 12*8 bytes long):
In [72]: bar1 = bar.ravel()
In [73]: bar1
Out[73]: array([ 0,  1,  2,  3, 10,  5, 10, 10, 10,  9, 10, 11])
Changing an element of the view changes the corresponding element of the 2d array:
In [74]: bar1[3] = 30
In [75]: bar
Out[75]:
array([[ 0,  1,  2],
       [30, 10,  5],
       [10, 10, 10],
       [ 9, 10, 11]])
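Applied to the arrays in the question, one way to get the "pointer"-like behaviour is to go the other direction: instead of assigning a separate array into bar, take a view of bar's last row and write through that. A rough sketch (the name last_row is mine, not from the question):

```python
import numpy as np

bar = np.zeros((10, 1), dtype=int)

last_row = bar[-1]   # basic indexing returns a view, not a copy
last_row[-1] = 10    # writes through to bar's buffer

print(last_row)                         # [10]
print(bar[-1])                          # [10]
print(np.shares_memory(last_row, bar))  # True
```

This only works because the view is created from bar; an array created independently (with np.zeros or np.copy) has its own buffer and cannot be attached to bar afterwards.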
While we can make object dtype arrays, which store references as lists do, they do not have any performance benefits.
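For completeness, a small sketch of what an object dtype array does with the question's foo (the name holder is hypothetical). Each slot holds a reference to a Python object, so a change made through foo is visible through the slot as well:

```python
import numpy as np

foo = np.zeros(1, dtype=int)

holder = np.empty(3, dtype=object)  # each slot stores a Python reference
holder[-1] = foo                    # stores the array object itself
foo[-1] = 10                        # mutate foo in place

print(holder[-1])         # [10]
print(holder[-1] is foo)  # True: same object, not a copy
```

The trade-off is the one mentioned above: holder behaves much like a list, and whole-array arithmetic on it falls back to per-element Python operations.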
The bytestring containing the 'raw data' of bar:
In [76]: bar.tobytes()
Out[76]: b'\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x1e\x00\x00\x00\x00\x00\x00\x00\n\x00\x00\x00\x00\x00\x00\x00\x05\x00\x00\x00\x00\x00\x00\x00\n\x00\x00\x00\x00\x00\x00\x00\n\x00\x00\x00\x00\x00\x00\x00\n\x00\x00\x00\x00\x00\x00\x00\t\x00\x00\x00\x00\x00\x00\x00\n\x00\x00\x00\x00\x00\x00\x00\x0b\x00\x00\x00\x00\x00\x00\x00'
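As a quick check that these bytes really are the whole content of the array, the sketch below rebuilds an equal array from the bytestring. It uses bar.dtype and bar.itemsize rather than hard-coding 8 bytes, since the default integer size depends on the platform:

```python
import numpy as np

bar = np.arange(12).reshape(4, 3)

raw = bar.tobytes()  # a copy of the data, in C (row-major) order
print(len(raw) == bar.size * bar.itemsize)  # True

# Reinterpret the bytes with the same dtype and shape
rebuilt = np.frombuffer(raw, dtype=bar.dtype).reshape(bar.shape)
print(np.array_equal(rebuilt, bar))  # True
```

Note that tobytes returns a copy, so changing bar afterwards does not change raw.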
The fabled numpy speed comes from working on this raw data with compiled C code. Accessing individual elements from Python code is relatively slow. It's the whole-array operations like bar*3 that are fast.
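To make the contrast concrete, here is a sketch of the same computation done both ways. The two give identical results; the difference is where the loop runs. I have left out timings, since they depend on the machine and the array size:

```python
import numpy as np

bar = np.arange(12).reshape(4, 3)

# Whole-array operation: one call, the loop runs in compiled code
fast = bar * 3

# Element-by-element: every access goes through the Python interpreter
slow = np.empty_like(bar)
for i in range(bar.shape[0]):
    for j in range(bar.shape[1]):
        slow[i, j] = bar[i, j] * 3

print(np.array_equal(fast, slow))  # True
```

On an array this small the difference is negligible; it becomes noticeable as the array grows.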
Upvotes: 2