user17495296

array_2 = array_1 vs. array_2 = array_1.view()

What is the difference between:

array_2 = array_1
array_2 = array_1.view()

I want an example where making a change through array_2 has one effect on array_1 in the first case and a different effect in the second case.

Upvotes: 4

Views: 252

Answers (1)

juanpa.arrivillaga

Reputation: 95957

The key thing to understand is that assignment never copies or creates a new object. Assignment merely binds a new name to the same object. It creates an "alias": a different name for the same thing. A view creates a new object that shares the same underlying primitive buffer, the block of memory where numpy stores the array's data. Hence, changes to that data are visible to all Python objects that use that buffer.

Consider:

>>> import numpy as np
>>> array = np.array([-3, -2, -1, 0, 1, 2, 3, 4], dtype=np.int64)
>>> array_view = array.view()
>>> array_alias = array
>>> array_alias is array
True
>>> array_view is array
False

So array and array_alias are two different names referring to the same Python object, the numpy.ndarray we created at the beginning. The array and array_view are two separate Python objects.
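The same distinction can be checked programmatically. This is a minimal sketch, assuming a recent numpy; the variable names on the left-hand side are mine, while `ndarray.base` and `np.shares_memory` are regular numpy features:

```python
import numpy as np

array = np.array([-3, -2, -1, 0, 1, 2, 3, 4], dtype=np.int64)
array_alias = array          # second name for the same object
array_view = array.view()    # new ndarray object on the same buffer

same_object_alias = array_alias is array        # alias: identical object
same_object_view = array_view is array          # view: a different object
view_base_is_array = array_view.base is array   # the view records whose memory it uses
shares = np.shares_memory(array, array_view)    # both read the same bytes

print(same_object_alias, same_object_view, view_base_is_array, shares)
```

So `is` tells you whether two names are aliases, while `.base` and `np.shares_memory` tell you whether two distinct objects sit on the same data.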

If I modify the object, then obviously both names referring to that same object will be able to see the change:

>>> array.dtype = np.uint64
>>> array
array([18446744073709551613, 18446744073709551614, 18446744073709551615,
                          0,                    1,                    2,
                          3,                    4], dtype=uint64)
>>> array_alias
array([18446744073709551613, 18446744073709551614, 18446744073709551615,
                          0,                    1,                    2,
                          3,                    4], dtype=uint64)

But the view will not:

>>> array_view
array([-3, -2, -1,  0,  1,  2,  3,  4])
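To tie this back to the names in the question, here is a small sketch of the two cases side by side. It uses the shape instead of the dtype as the per-object property, and `array_3` is a name I made up for the view case:

```python
import numpy as np

# Case 1: plain assignment. array_2 is just another name for array_1.
array_1 = np.arange(6)
array_2 = array_1
array_2.shape = (2, 3)           # modifies the one shared object
alias_case_shape = array_1.shape

# Case 2: view. array_3 is a separate object with its own shape.
array_1 = np.arange(6)
array_3 = array_1.view()
array_3.shape = (2, 3)           # modifies only the view object
view_case_shape = array_1.shape

print(alias_case_shape)  # shape of array_1 after reshaping the alias
print(view_case_shape)   # shape of array_1 after reshaping the view
```

In the first case `array_1` ends up two-dimensional, because there was only ever one object. In the second case `array_1` keeps its original one-dimensional shape.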

However, if I modify the data, since both distinct objects (that is, the original array and the view) are referencing the same underlying buffer, the change is visible to the alias and the view:

>>> array[3] = 1337
>>> array
array([18446744073709551613, 18446744073709551614, 18446744073709551615,
                       1337,                    1,                    2,
                          3,                    4], dtype=uint64)
>>> array_alias
array([18446744073709551613, 18446744073709551614, 18446744073709551615,
                       1337,                    1,                    2,
                          3,                    4], dtype=uint64)
>>> array_view
array([  -3,   -2,   -1, 1337,    1,    2,    3,    4])
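The sharing works in both directions: a write through the view shows up under the original name too. A minimal sketch, assuming I build the unsigned interpretation with `array.view(np.uint64)` rather than assigning to `dtype`; `unsigned_view` is a name I picked for illustration:

```python
import numpy as np

array = np.array([-3, -2, -1, 0], dtype=np.int64)
array_alias = array
unsigned_view = array.view(np.uint64)   # same bytes, read as unsigned integers

unsigned_view[3] = 1337   # write through the view
array[0] = -1             # write through the original name

print(array)              # signed reading of the buffer
print(array_alias)        # same object as array, so the same output
print(unsigned_view)      # unsigned reading of the same buffer
```

Both writes land in the one shared buffer, and each object simply interprets those bytes according to its own dtype.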

Here is another example, using a slice, another operation that creates a view for numpy objects:

>>> array = np.arange(16)
>>> array
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])
>>> array_slice = array[::4]
>>> array_slice
array([ 0,  4,  8, 12])
>>> array is array_slice
False

I can manipulate the objects independently:

>>> array_slice.shape = (2, 2)
>>> array_slice
array([[ 0,  4],
       [ 8, 12]])
>>> array
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15])

But I cannot manipulate their data independently:

>>> array_slice[:] = 1337
>>> array_slice
array([[1337, 1337],
       [1337, 1337]])
>>> array
array([1337,    1,    2,    3, 1337,    5,    6,    7, 1337,    9,   10,
         11, 1337,   13,   14,   15])
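Not every indexing operation gives you a view, though. As a rough sketch (the name `array_picked` is mine): basic slicing returns a view, whereas indexing with a list of integers returns a copy with its own buffer, so writes to the slice do not reach it:

```python
import numpy as np

array = np.arange(16, dtype=np.int64)
array_slice = array[::4]               # basic slicing: a view
array_picked = array[[0, 4, 8, 12]]    # integer-list indexing: a copy

slice_is_view = array_slice.base is array
slice_shares = np.shares_memory(array, array_slice)
picked_shares = np.shares_memory(array, array_picked)

# The view walks the same buffer with a larger step (in bytes).
print(array.strides, array_slice.strides)

array_slice[:] = 1337                  # reaches array, not array_picked
print(array[::4])
print(array_picked)
```

The strides make the mechanism visible: the slice is the same memory, just read every fourth element.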

Finally, note that the np.ndarray.view method is generally invoked when you want to create a new object that is a view of the original array but with a different dtype, so:

>>> array = np.array([-2, -1, 0, 1, 2], dtype=np.int64)
>>> bytewise = array.view(np.uint8)
>>> bytewise
array([254, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255,
       255, 255, 255,   0,   0,   0,   0,   0,   0,   0,   0,   1,   0,
         0,   0,   0,   0,   0,   0,   2,   0,   0,   0,   0,   0,   0,
         0], dtype=uint8)

Let's modify the view's shape in place to take a look at how 64-bit signed integers are represented in raw bytes (i.e. 8-bit integers!), here on a little-endian machine:

>>> bytewise.shape = array.shape[0], 8
>>> bytewise
array([[254, 255, 255, 255, 255, 255, 255, 255],
       [255, 255, 255, 255, 255, 255, 255, 255],
       [  0,   0,   0,   0,   0,   0,   0,   0],
       [  1,   0,   0,   0,   0,   0,   0,   0],
       [  2,   0,   0,   0,   0,   0,   0,   0]], dtype=uint8)
>>> array
array([-2, -1,  0,  1,  2])
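Because the byte view shares the buffer, editing a single byte changes the integer it belongs to. A small sketch, assuming an explicitly little-endian dtype (`"<i8"`) so the byte order does not depend on the machine:

```python
import numpy as np

array = np.array([1, 2], dtype="<i8")   # little-endian 64-bit signed integers
bytewise = array.view(np.uint8)         # the same 16 bytes, one uint8 each

n_bytes = bytewise.size                 # array.size * array.itemsize
bytewise[1] = 1                         # set the second-lowest byte of array[0]
first_value = int(array[0])             # now 1 + 1 * 256

print(n_bytes, first_value)
```

The value 1 was stored as the bytes 1, 0, 0, ... and setting the next byte to 1 adds 256, so the first element reads back as 257.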

Upvotes: 7
