MaxSchambach

Reputation: 13

Creating a non-trivial view of a NumPy array

TL;DR:

I am looking for a way to get a non-trivial, and in particular non-contiguous, view of a numpy ndarray.

E.g., given a 1D ndarray, x = np.array([1, 2, 3, 4]), is there a way to get a non-trivial view of it, e.g. np.array([2, 4, 3, 1])?

Longer Version

The context of the question is the following: I have a 4D ndarray of shape (U, V, S, T) which I would like to reshape to a 2D ndarray of shape (U*S, V*T) in a non-trivial way, i.e. a simple np.reshape() does not do the trick, as I have a more complex indexing scheme in mind in which the reshaped array will not be contiguous in memory. The arrays in my case are rather large, and I would like to get a view and not a copy of the array.

Example

Given an array x(u, v, s, t) of shape (2, 2, 2, 2):

x = np.array([[[[1, 1], [1, 1]],[[2, 2], [2, 2]]],
              [[[3, 3], [3, 3]], [[4, 4], [4, 4]]]])

I would like to get the view z(a, b) of the array:

np.array([[1, 1, 2, 2],
          [1, 1, 2, 2],
          [3, 3, 4, 4],
          [3, 3, 4, 4]])

This corresponds to an indexing scheme of a = u * S + s and b = v * T + t, where in this case S = 2 = T.

What I have tried

  1. Various approaches using np.reshape or even as_strided. Standard reshaping does not change the order of the elements as they appear in memory. I tried playing around with order='F' and transposing a bit, but nothing I tried gave me the correct result.

  2. Since I know the indexing scheme, I tried to operate on the flattened view of the array using np.ravel(). My idea was to create an array of indices following the desired indexing scheme and apply it to the flattened array view, but unfortunately, fancy/advanced indexing gives a copy of the array, not a view.

Question

Is there any way to achieve the indexing view that I'm looking for?

In principle, I think this should be possible, as for example ndarray.sort() performs an in-place non-trivial reordering of the array. On the other hand, this is probably implemented in C/C++, so it might not even be possible in pure Python?

Upvotes: 1

Views: 234

Answers (1)

hpaulj

Reputation: 231605

Let's review the basics of an array - it has a flat data buffer, a shape, strides, and a dtype. The last three attributes are used to view the elements of the data buffer in a particular way, whether it is a simple 1d sequence, 2d or higher dimensions.

A true view then uses the same data buffer, but applies a different shape, strides or dtype to it.
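As a minimal sketch of what that means in practice (the variable names here are just for illustration), a reshape of a contiguous array reuses the buffer and only changes shape and strides, so a write through the view shows up in the original:

```python
import numpy as np

arr = np.arange(12)        # flat buffer of 12 elements
view = arr.reshape(3, 4)   # same buffer, different shape and strides

# Strides are in bytes, so express them via itemsize to stay dtype-agnostic.
print(arr.strides, view.strides)
print(np.shares_memory(arr, view))   # True: no data was copied

# Writing through the view changes the original, since the buffer is shared.
view[0, 0] = 99
print(arr[0])                        # 99

# A transpose is also just a view: same buffer, strides swapped.
print(view.T.strides)
```

np.shares_memory is a convenient way to test whether a result is a view or a copy.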

To get [2, 4, 3, 1] from [1, 2, 3, 4] requires starting at 2, jumping forward 2 to reach 4, then stepping back 1 to reach 3 and back 2 to reach 1. That's not a regular pattern that can be represented by strides.

arr[1::2] gives the [2,4], and arr[0::2] gives the [1,3].
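A short check of the difference, assuming the 1D example from the question: the strided slice is a view, while the permutation can only be produced with advanced indexing, which copies.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

evens = arr[1::2]          # basic slice: regular step, so a view
perm = arr[[1, 3, 2, 0]]   # advanced indexing: irregular order, so a copy

print(evens, perm)                    # [2 4] [2 4 3 1]
print(np.shares_memory(arr, evens))   # True
print(np.shares_memory(arr, perm))    # False

# The slice just doubles the step through the same buffer.
print(evens.strides == (2 * arr.itemsize,))   # True
```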

(U, V, S, T) to (U*S, V*T) requires a transpose to (U, S, V, T), followed by a reshape:

arr.transpose(0,2,1,3).reshape(U*S, V*T)

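That will require a copy, no way around that. The transpose by itself is still a view, but in general the reshape that follows has to copy, because the merged axes are no longer evenly spaced in memory and so can't be described by a single stride each.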
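One way to see where the copy happens, as a sketch with arbitrarily chosen sizes U, V, S, T = 2, 3, 4, 5:

```python
import numpy as np

U, V, S, T = 2, 3, 4, 5
arr = np.arange(U * V * S * T).reshape(U, V, S, T)

arr_t = arr.transpose(0, 2, 1, 3)     # (U, S, V, T): still a view
arr1 = arr_t.reshape(U * S, V * T)    # (U*S, V*T): forces a copy

print(np.shares_memory(arr, arr_t))   # True
print(arr_t.flags['C_CONTIGUOUS'])    # False
print(np.shares_memory(arr, arr1))    # False

# The result follows the indexing scheme a = u*S + s, b = v*T + t.
u, v, s, t = 1, 2, 3, 4
print(arr1[u * S + s, v * T + t] == arr[u, v, s, t])   # True
```

If a copy is not acceptable, the 4D transposed view arr_t can still be used directly, at the cost of indexing it with four indices instead of two.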

In [227]: arr = np.arange(2*3*4*5).reshape(2,3,4,5)
In [230]: arr1 = arr.transpose(0,2,1,3).reshape(2*4, 3*5)
In [231]: arr1.shape
Out[231]: (8, 15)
In [232]: arr1
Out[232]: 
array([[  0,   1,   2,   3,   4,  20,  21,  22,  23,  24,  40,  41,  42,
         43,  44],
       [  5,   6,   7,   8,   9,  25,  26,  27,  28,  29,  45,  46,  47,
         48,  49],
       ....)

Or with your x

In [234]: x1 = x.transpose(0,2,1,3).reshape(4,4)
In [235]: x1
Out[235]: 
array([[1, 1, 2, 2],
       [1, 1, 2, 2],
       [3, 3, 4, 4],
       [3, 3, 4, 4]])

Notice that the elements are in a different order:

In [254]: x.ravel()
Out[254]: array([1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4])
In [255]: x1.ravel()
Out[255]: array([1, 1, 2, 2, 1, 1, 2, 2, 3, 3, 4, 4, 3, 3, 4, 4])

ndarray.sort is in-place and changes the order of bytes in the data buffer. It is operating at a low level that we don't have access to. It isn't a view of the original array.
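A small illustration of that point, using a made-up 4-element array: after the in-place sort, an existing view sees the rearranged buffer rather than the old values.

```python
import numpy as np

arr = np.array([3, 1, 4, 2])
view = arr[::2]          # view of buffer positions 0 and 2
before = view.copy()     # snapshot of what the view showed: [3 4]

arr.sort()               # rewrites the data buffer itself

print(arr)               # [1 2 3 4]
print(before, view)      # [3 4] [1 3]
```

The view still points at positions 0 and 2, but the values stored there have changed, which is the opposite of reindexing an unchanged buffer.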

Upvotes: 1
