Reputation: 321
Here is my code:
import numpy as np
my_array_1 = np.arange(25).reshape(5, 5)
print(my_array_1)
my_array_red = my_array_1[:, 1::2]
print(my_array_red)
my_array_blue = my_array_1[1::2, 0:3:2]
print(my_array_blue)
my_array_yellow = my_array_1[-1, :]
print(my_array_yellow)
print(id(my_array_1))
print(id(my_array_red))
print(id(my_array_yellow))
print(id(my_array_blue))
print(my_array_1.data)
print(my_array_red.data)
print(my_array_blue.data)
print(my_array_yellow.data)
Here is the Output:
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]
[[ 1  3]
 [ 6  8]
 [11 13]
 [16 18]
 [21 23]]
[[ 5  7]
 [15 17]]
[20 21 22 23 24]
2606769150592
2606769282544
2607017647120
2606769282624
<memory at 0x0000025EFE56CA68>
<memory at 0x0000025EFE56CA68>
<memory at 0x0000025EFE56CA68>
<memory at 0x0000025EFE5A8F48>
Question: Look at the last four lines of my output. Why do my_array_1.data, my_array_red.data and my_array_blue.data print the same value, whereas my_array_yellow.data prints a different one?
Upvotes: 3
Views: 120
Reputation: 231385
I find the data value of the __array_interface__ to be more informative:
In [2]: my_array_1.__array_interface__['data']
Out[2]: (33691856, False)
In [3]: my_array_red.__array_interface__['data']
Out[3]: (33691864, False)
In [4]: my_array_blue.__array_interface__['data']
Out[4]: (33691896, False)
In [5]: my_array_yellow.__array_interface__['data']
Out[5]: (33692016, False)
Out[2] is the start of the data buffer.
red is 8 bytes further in, that is, one element from the start.
blue is 40 bytes in, at the start of the next row:
In [8]: my_array_1.strides
Out[8]: (40, 8)
yellow is 160 bytes in, which is the start of the last row (40 bytes from the end):
In [9]: 2016-1856
Out[9]: 160
In [10]: my_array_1.nbytes
Out[10]: 200
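The offsets worked out above can also be checked programmatically. This is a small sketch, not part of the original session: buffer_address is a hypothetical helper name, and the expected offsets are derived from the base array's strides rather than hard-coded, since the element size depends on the platform's default integer dtype.

```python
import numpy as np

def buffer_address(arr):
    """Hypothetical helper: address of the first element of arr's data."""
    return arr.__array_interface__["data"][0]

base = np.arange(25).reshape(5, 5)
views = {
    "red": base[:, 1::2],        # starts at element [0, 1]
    "blue": base[1::2, 0:3:2],   # starts at element [1, 0]
    "yellow": base[-1, :],       # starts at element [4, 0]
}

row_stride, col_stride = base.strides

# Byte distance of each view's first element from the start of the buffer.
offsets = {name: buffer_address(v) - buffer_address(base)
           for name, v in views.items()}

# What the strides predict for those starting elements.
expected = {
    "red": 1 * col_stride,
    "blue": 1 * row_stride,
    "yellow": 4 * row_stride,
}

for name in views:
    print(name, offsets[name], expected[name])
```

With 8-byte integers this reproduces the 8, 40 and 160 byte offsets seen above.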
The addresses shown by the data attribute also differ from one another, but they are harder to interpret:
In [11]: my_array_1.data
Out[11]: <memory at 0x7fa975369a68>
In [12]: my_array_red.data
Out[12]: <memory at 0x7fa975369b40>
In [13]: my_array_blue.data
Out[13]: <memory at 0x7fa975369c18>
In [14]: my_array_yellow.data
Out[14]: <memory at 0x7fa9710f11c8>
The data attribute can be used as the buffer in an ndarray constructor.
Two elements from yellow:
In [17]: np.ndarray(2,dtype=my_array_1.dtype,buffer=my_array_yellow.data)
Out[17]: array([20, 21])
Same 2 elements, but with the original address, and an offset (as deduced above):
In [18]: np.ndarray(2,dtype=my_array_1.dtype,buffer=my_array_1.data, offset=160)
Out[18]: array([20, 21])
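The same constructor can rebuild a strided view as well, if an offset and strides are supplied along with the buffer. The following is a sketch under the assumption that the base array is C-contiguous (as it is here); the name rebuilt_red is only illustrative.

```python
import numpy as np

base = np.arange(25).reshape(5, 5)
red = base[:, 1::2]

row_stride, col_stride = base.strides

# Rebuild `red` directly on top of the base buffer:
# start one element in, step one row down and two columns across.
rebuilt_red = np.ndarray(
    shape=red.shape,
    dtype=base.dtype,
    buffer=base.data,
    offset=col_stride,
    strides=(row_stride, 2 * col_stride),
)

print(np.array_equal(rebuilt_red, red))      # same values as the slice
print(np.shares_memory(rebuilt_red, base))   # same underlying buffer
```

In ordinary code slicing is the way to get such a view; the explicit constructor just makes the offset and strides visible.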
Actually, the data display doesn't tell us anything about where the data buffer is located. It's the address of the memoryview object that references the buffer, not the address of the buffer itself. Access data again and you get a different memoryview object:
In [19]: my_array_1.data
Out[19]: <memory at 0x7fa975369cf0>
If I print these memoryview objects, I get the same pattern as you do:
In [22]: print(my_array_1.data)
<memory at 0x7fa970e37120>
In [23]: print(my_array_red.data)
<memory at 0x7fa970e37120>
In [24]: print(my_array_blue.data)
<memory at 0x7fa970e37120>
In [25]: print(my_array_yellow.data)
<memory at 0x7fa9710f17c8>
For 23 and 24, it's just reusing the memory slot freed after 22, because with print the memoryview object doesn't persist. I'm not sure why yellow doesn't reuse it, except maybe the object is sufficiently different that it doesn't fit in the same space. In the Out[11] etc. cases, the ipython output buffering (the Out history) hangs onto those objects, and thus there's no reuse.
It just reinforces the idea that there's nothing significant about the print display of these memoryview objects. It has nothing to do with the data buffer location. It's more like the id, just an arbitrary place in memory.
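To test whether two arrays actually use the same buffer, np.shares_memory is a more reliable check than comparing printed memoryview addresses. The sketch below also illustrates the point about the display: the repr parsing assumes CPython, where the address in "<memory at 0x...>" is the id of the memoryview object itself.

```python
import numpy as np

base = np.arange(25).reshape(5, 5)
yellow = base[-1, :]

# Each access to .data creates a fresh memoryview object.
mv1 = base.data
mv2 = base.data
distinct_objects = mv1 is not mv2

# CPython assumption: the hex number in the repr is id(mv1),
# i.e. the address of the memoryview object, not of the array data.
addr_in_repr = int(repr(mv1).split()[-1].rstrip(">"), 16)
repr_is_object_id = addr_in_repr == id(mv1)

# The reliable check for a shared data buffer.
shares = np.shares_memory(base, yellow)

print(distinct_objects, repr_is_object_id, shares)
```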
Upvotes: 2