Reputation: 13
I am looking for a way to get a non-trivial, and in particular non-contiguous, view of a numpy ndarray.
E.g., given a 1D ndarray, x = np.array([1, 2, 3, 4]), is there a way to get a non-trivial view of it, e.g. np.array([2, 4, 3, 1])?
The context of the question is the following: I have a 4D ndarray of shape (U, V, S, T) which I would like to reshape to a 2D ndarray of shape (U*S, V*T) in a non-trivial way, i.e. a simple np.reshape() does not do the trick, as I have a more complex indexing scheme in mind in which the reshaped array will not be contiguous in memory. The arrays in my case are rather large, and I would like to get a view and not a copy of the array.
Given an array x(u, v, s, t) of shape (2, 2, 2, 2):
x = np.array([[[[1, 1], [1, 1]],[[2, 2], [2, 2]]],
[[[3, 3], [3, 3]], [[4, 4], [4, 4]]]])
I would like to get the view z(a, b) of the array:
np.array([[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]])
This corresponds to an indexing scheme of a = u * S + s and b = v * T + t, where in this case S = 2 = T.
I tried various approaches using np.reshape or even as_strided. Standard reshaping does not change the order of the elements as they appear in memory. I tried playing around with order='F' and transposing a bit, but could not work out a combination that gave me the correct result.
Since I know the indexing scheme, I tried to operate on the flattened view of the array using np.ravel(). My idea was to create an array of indices following the desired indexing scheme and apply it to the flattened array view, but unfortunately, fancy/advanced indexing gives a copy of the array, not a view.
Is there any way to achieve the indexing view that I'm looking for?
In principle, I think this should be possible, as for example ndarray.sort() performs an in-place, non-trivial reordering of the array. On the other hand, this is probably implemented in C/C++, so it might not even be possible in pure Python?
Upvotes: 1
Views: 234
Reputation: 231605
Let's review the basics of an array: it has a flat data buffer, a shape, strides, and dtype. Those three attributes are used to view the elements of the data buffer in a particular way, whether it is a simple 1d sequence, 2d or higher dimensions.
A true view uses the same data buffer, but applies a different shape, strides or dtype to it.
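As a minimal sketch of those attributes (the variable names are illustrative, and the stride values shown assume an explicit 8-byte integer dtype):

```python
import numpy as np

base = np.arange(12, dtype=np.int64)   # flat buffer of 12 eight-byte integers
view2d = base.reshape(3, 4)            # same buffer, new shape and strides
view_t = view2d.T                      # transpose: shape and strides are swapped

print(base.strides)     # (8,)
print(view2d.strides)   # (32, 8)
print(view_t.strides)   # (8, 32)
print(np.shares_memory(base, view_t))  # True

view_t[0, 1] = -1       # writing through the view changes the shared buffer
print(base[4])          # -1
```

Each axis gets exactly one stride, which is why only regular, constant-step patterns can be expressed as a view.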
To get [2, 4, 3, 1] from [1,2,3,4] requires starting at 2, jumping forward 2 to reach 4, then stepping back 1 to reach 3, and back 2 to reach 1. The step is not constant, so that's not a regular pattern that can be represented by strides.
arr[1::2] gives [2, 4], and arr[0::2] gives [1, 3]; each of those has a constant step of 2, so each is a regular strided view.
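A short sketch of the difference, using np.shares_memory to check whether the result still refers to the original buffer (the variable names are illustrative):

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

stepped = arr[1::2]          # constant step of 2: basic slicing returns a view
picked = arr[[1, 3, 2, 0]]   # integer-array (advanced) indexing returns a copy

print(stepped, np.shares_memory(arr, stepped))   # [2 4] True
print(picked, np.shares_memory(arr, picked))     # [2 4 3 1] False
```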
Going from (U, V, S, T) to (U*S, V*T) requires a transpose to (U, S, V, T), followed by a reshape:
arr.transpose(0,2,1,3).reshape(U*S, V*T)
The transpose on its own is still a view, but the reshape that follows will require a copy; there is no way around that.
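One way to check where the copy happens is np.shares_memory; this is only a sketch, with the sizes chosen arbitrarily and the names swapped and merged made up for illustration:

```python
import numpy as np

U, V, S, T = 2, 3, 4, 5
arr = np.arange(U * V * S * T).reshape(U, V, S, T)

swapped = arr.transpose(0, 2, 1, 3)      # shape (U, S, V, T); only the strides change
merged = swapped.reshape(U * S, V * T)   # merging the axes forces NumPy to copy here

print(swapped.shape, np.shares_memory(arr, swapped))   # (2, 4, 3, 5) True
print(merged.shape, np.shares_memory(arr, merged))     # (8, 15) False
```

After the transpose, the U and S axes no longer have strides that combine into a single constant step, so the merged axis cannot be described by one stride.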
In [227]: arr = np.arange(2*3*4*5).reshape(2,3,4,5)
In [230]: arr1 = arr.transpose(0,2,1,3).reshape(2*4, 3*5)
In [231]: arr1.shape
Out[231]: (8, 15)
In [232]: arr1
Out[232]:
array([[ 0, 1, 2, 3, 4, 20, 21, 22, 23, 24, 40, 41, 42,
43, 44],
[ 5, 6, 7, 8, 9, 25, 26, 27, 28, 29, 45, 46, 47,
48, 49],
....)
Or with your x:
In [234]: x1 = x.transpose(0,2,1,3).reshape(4,4)
In [235]: x1
Out[235]:
array([[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]])
Notice that the elements are in a different order:
In [254]: x.ravel()
Out[254]: array([1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4])
In [255]: x1.ravel()
Out[255]: array([1, 1, 2, 2, 1, 1, 2, 2, 3, 3, 4, 4, 3, 3, 4, 4])
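The indexing scheme from the question, a = u*S + s and b = v*T + t, can be checked against the transposed and reshaped array. The sketch below uses an array with distinct values rather than your x, so that the check cannot pass by coincidence. The helper z_at is hypothetical: it only illustrates that single elements of z can be read from the original buffer without building the copy.

```python
import numpy as np

U, V, S, T = 2, 2, 2, 2
x = np.arange(U * V * S * T).reshape(U, V, S, T)   # distinct values 0..15
x1 = x.transpose(0, 2, 1, 3).reshape(U * S, V * T)

# Check z[a, b] == x[u, v, s, t] with a = u*S + s and b = v*T + t.
for u, v, s, t in np.ndindex(U, V, S, T):
    assert x1[u * S + s, v * T + t] == x[u, v, s, t]

def z_at(a, b):
    """Hypothetical helper: read z[a, b] directly from x, with no copy."""
    u, s = divmod(a, S)
    v, t = divmod(b, T)
    return x[u, v, s, t]

print(z_at(2, 1), x1[2, 1])   # 9 9
```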
ndarray.sort is in-place and changes the order of bytes in the data buffer. It operates at a low level that we don't have access to. It isn't a view of the original array.
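A small sketch of what in-place means here (the names data and alias are illustrative): a view taken before the sort sees the reordered values, because the buffer itself was rewritten.

```python
import numpy as np

data = np.array([3, 1, 4, 2])
alias = data[:]          # a view on the same buffer

data.sort()              # rearranges the elements inside the buffer itself

print(data)    # [1 2 3 4]
print(alias)   # [1 2 3 4]
```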
Upvotes: 1