Reputation: 3412
I have a NumPy array of shape (1000,100). I would like to create a new array containing the first 100 rows followed by rows 200 through 299 (boundaries included). Is there a way to do it using only views, without copying the array's data?
Upvotes: 1
Views: 89
Reputation: 880547
Unfortunately, not.
Here is why: A NumPy array draws data from an underlying block of contiguous memory. The dtype, shape, and strides of the array determine how the data in that block of memory is to be interpreted as values.
Since an array can have only one strides attribute (one stride per axis), the values have to be regularly spaced along each axis. Therefore, an array cannot be a view of another array that takes values from the original at irregularly spaced intervals.
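To make the stride argument concrete, here is a small sketch. The array contents (filled with np.arange) and the variable names are only illustrative assumptions; the point is the contrast between a regularly spaced slice, which stays a view, and the irregular row selection from the question, which NumPy can only produce by copying.

```python
import numpy as np

# Illustrative stand-in for the (1000, 100) array from the question.
a = np.arange(1000 * 100).reshape(1000, 100)

# Every second row is regularly spaced, so a single row stride describes it.
every_other = a[::2]
print(every_other.strides[0] == 2 * a.strides[0])  # True
print(np.shares_memory(a, every_other))            # True

# Rows 0-99 followed by rows 200-299: no single row stride fits,
# so integer (fancy) indexing returns a copy.
rows = np.r_[0:100, 200:300]
picked = a[rows]
print(picked.shape)                                # (200, 100)
print(np.shares_memory(a, picked))                 # False
```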
Note, however, that Divakar shows that by a clever reshaping to a 3D array, the desired values can be viewed as a slice with a regularly spaced stride. So if you are willing to add another dimension, it is possible to create a view with the desired values.
Building on Divakar's answer, you could also use a.reshape(10,-1,a.shape[1])[:3:2]. This breaks the array into 10 chunks of 100 rows each, then takes the first 3 chunks with a step of 2, giving you only the first and third chunks.
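A quick way to check the one-liner above on the array shape from the question is sketched below. The np.arange contents are an assumption made for illustration, and the write at the end is only there to show that the result really aliases the original data.

```python
import numpy as np

# Illustrative stand-in for the (1000, 100) array from the question.
a = np.arange(1000 * 100).reshape(1000, 100)

view = a.reshape(10, -1, a.shape[1])[:3:2]
print(view.shape)                                # (2, 100, 100)

# The two chunks are exactly the requested row ranges.
print(np.array_equal(view[0], a[:100]))          # True
print(np.array_equal(view[1], a[200:300]))       # True

# The leading axis steps over 200 rows of the original at a time.
print(view.strides[0] == 200 * a.strides[0])     # True

# Writing through the view changes the original array.
view[1, 0, 0] = -1
print(a[200, 0])                                 # -1
```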
Upvotes: 2
Reputation: 221664
You could have a 3D array of shape (2,100,100) with some slicing and reshaping, where the first element would be the first block (rows 0-99) and the second element would be the second block (rows 200-299) of the input array.
The implementation would be -
a[:300].reshape(3,-1,a.shape[1])[::2]
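If the same pattern is needed for other block sizes, it could be wrapped along the following lines. This is only a sketch: the helper name every_other_block and its parameters are hypothetical, and it assumes a C-contiguous 2D input, since reshape may otherwise return a copy without warning.

```python
import numpy as np

def every_other_block(a, block_rows, stop_row, step=2):
    """Hypothetical helper: 3D view of every `step`-th block of
    `block_rows` rows taken from a[:stop_row]."""
    if not a.flags.c_contiguous:
        # On a non-contiguous input, reshape could silently copy.
        raise ValueError("expected a C-contiguous array")
    if stop_row % block_rows:
        raise ValueError("stop_row must be a multiple of block_rows")
    n_blocks = stop_row // block_rows
    return a[:stop_row].reshape(n_blocks, block_rows, a.shape[1])[::step]

# Small illustrative input: take rows 0-4 and rows 10-14.
a = np.arange(100).reshape(20, 5)
blocks = every_other_block(a, block_rows=5, stop_row=15)
print(blocks.shape)                            # (2, 5, 5)
print(np.array_equal(blocks[1], a[10:15]))     # True
print(np.shares_memory(a, blocks))             # True
```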
Sample run with an input array of shape (20,5), where we try to get rows 0-4 and rows 10-14 -
1) Input array :
In [364]: a
Out[364]:
array([[6, 2, 3, 4, 7],
[4, 7, 7, 4, 7],
[3, 5, 6, 2, 1],
[0, 6, 7, 4, 8],
[1, 5, 8, 6, 7],
[6, 3, 3, 3, 3],
[1, 6, 1, 3, 5],
[6, 8, 4, 7, 6],
[8, 4, 6, 8, 7],
[4, 8, 3, 5, 2],
[4, 6, 7, 0, 8],
[7, 1, 6, 0, 7],
[1, 5, 5, 4, 4],
[3, 4, 8, 4, 7],
[0, 4, 5, 0, 5],
[2, 6, 8, 2, 4],
[5, 6, 2, 5, 0],
[6, 2, 4, 2, 7],
[3, 1, 6, 8, 4],
[0, 4, 3, 2, 0]])
2) Use proposed slicing and reshaping to get us a 3D array :
In [365]: a[:15].reshape(3,-1,a.shape[1])[::2]
Out[365]:
array([[[6, 2, 3, 4, 7],
[4, 7, 7, 4, 7],
[3, 5, 6, 2, 1],
[0, 6, 7, 4, 8],
[1, 5, 8, 6, 7]],
[[4, 6, 7, 0, 8],
[7, 1, 6, 0, 7],
[1, 5, 5, 4, 4],
[3, 4, 8, 4, 7],
[0, 4, 5, 0, 5]]])
3) Verify output with manual slicing :
In [366]: a[:5]
Out[366]:
array([[6, 2, 3, 4, 7],
[4, 7, 7, 4, 7],
[3, 5, 6, 2, 1],
[0, 6, 7, 4, 8],
[1, 5, 8, 6, 7]])
In [367]: a[10:15]
Out[367]:
array([[4, 6, 7, 0, 8],
[7, 1, 6, 0, 7],
[1, 5, 5, 4, 4],
[3, 4, 8, 4, 7],
[0, 4, 5, 0, 5]])
4) Finally, the most important part, verifying that it is indeed a view :
In [368]: np.shares_memory(a, a[:15].reshape(3,-1,a.shape[1])[::2])
Out[368]: True
5) We could of course reshape it afterwards to get a 2D output, but that forces a copy there -
In [371]: a[:15].reshape(3,-1,a.shape[1])[::2].reshape(-1,a.shape[1])
Out[371]:
array([[6, 2, 3, 4, 7],
[4, 7, 7, 4, 7],
[3, 5, 6, 2, 1],
[0, 6, 7, 4, 8],
[1, 5, 8, 6, 7],
[4, 6, 7, 0, 8],
[7, 1, 6, 0, 7],
[1, 5, 5, 4, 4],
[3, 4, 8, 4, 7],
[0, 4, 5, 0, 5]])
In [372]: np.shares_memory(a, _)
Out[372]: False
Upvotes: 1