Reputation: 153
I am facing a situation where I have a VERY large numpy.ndarray
(really, it's an hdf5 dataset) that I need to find a subset of quickly because they entire array cannot be held in memory. However, I also do not want to iterate through such an array (even declaring the built-in numpy iterator throws a MemoryError
) because my script would take literally days to run.
As such, I'm faced with the situation of iterating through some dimensions of the array so that I can perform array-operations on pared down subsets of the full array. To do that, I need to be able to dynamically slice out a subset of the array. Dynamic slicing means constructing a tuple and passing it.
For example, instead of
my_array[0,0,0]
I might use
my_array[(0,0,0,)]
Here's the problem: if I want to slice out all values along a particular dimension/axis of the array manually, I could do something like
my_array[0,:,0]
> array([1, 4, 7])
However, I this does not work if I use a tuple:
my_array[(0,:,0,)]
where I'll get a SyntaxError
.
How can I do this when I have to construct the slice dynamically to put something in the brackets of the array?
Upvotes: 4
Views: 4056
Reputation: 15889
You could slice automaticaly using python's slice
:
>>> a = np.random.rand(3, 4, 5)
>>> a[0, :, 0]
array([ 0.48054702, 0.88728858, 0.83225113, 0.12491976])
>>> a[(0, slice(None), 0)]
array([ 0.48054702, 0.88728858, 0.83225113, 0.12491976])
The slice
method reads as slice(*start*, stop[, step])
. If only one argument is passed, then it is interpreted as slice(0, stop)
.
In the example above :
is translated to slice(0, end)
which is equivalent to slice(None)
.
Other slice examples:
:5 -> slice(5)
1:5 -> slice(1, 5)
1: -> slice(1, None)
1::2 -> slice(1, None, 2)
Upvotes: 8
Reputation: 153
Okay, I finally found an answer just as someone else did.
Suppose I have array:
my_array[...]
>array(
[[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9]],
[[10, 11, 12],
[13, 14, 15],
[16, 17, 18]]])
I can use the slice
object, which apparently is a thing:
sl1 = slice( None )
sl2 = slice( 1,2 )
sl3 = slice( None )
ad_array.matrix[(sl1, sl2, sl3)]
>array(
[[[ 4, 5, 6]],
[[13, 14, 15]]])
Upvotes: 0