s7amuser

Reputation: 857

NumPy complicated slicing

I have a NumPy array, for example:

>>> import numpy as np
>>> x = np.random.randint(0, 10, size=(5, 5))
>>> x
array([[4, 7, 3, 7, 6],
       [7, 9, 5, 7, 8],
       [3, 1, 6, 3, 2],
       [9, 2, 3, 8, 4],
       [0, 9, 9, 0, 4]])

Is there a way to get a view (or copy) that contains indices 1:3 of the first row, indices 2:4 of the second row and indices 3:5 of the fourth row? In the above example, I wish to get:

>>> # What to write here?
array([[7, 3],
       [5, 7],
       [8, 4]])

Ideally, I would like a general method that also works efficiently on large, multi-dimensional arrays, not only on the toy example above.

Upvotes: 0

Views: 492

Answers (6)

Sacry

Reputation: 555

Someone has already pointed out the as_strided trick, and yes, you should really use it with caution.

Here is a broadcasting / fancy indexing approach. It is less efficient than as_strided because it makes a copy, but it still works pretty well IMO:

window_size, step_size = 2, 1

# index within window
index = np.arange(window_size)

# offset
offset = np.arange(1, 4, step_size)

# for your case it's [0, 1, 3], I'm not sure how to generalize it without further information
fancy_row = np.array([0, 1, 3]).reshape(-1, 1)

# array([[1, 2],
#        [2, 3],
#        [3, 4]])
fancy_col = offset.reshape(-1, 1) + index

x[fancy_row, fancy_col]
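
One possible way to generalize the snippet above is to pass the rows and the start column of each window explicitly. The helper name `window_take` and its parameters are made up for illustration. This is only a sketch, and it assumes every window has the same width and stays inside the array:

```python
import numpy as np

def window_take(a, rows, starts, width):
    """Return a copy holding a[r, s:s + width] for each (r, s) pair."""
    rows = np.asarray(rows).reshape(-1, 1)      # shape (n, 1)
    starts = np.asarray(starts).reshape(-1, 1)  # shape (n, 1)
    cols = starts + np.arange(width)            # broadcasts to shape (n, width)
    return a[rows, cols]                        # fancy indexing returns a copy

# The array from the question
x = np.array([[4, 7, 3, 7, 6],
              [7, 9, 5, 7, 8],
              [3, 1, 6, 3, 2],
              [9, 2, 3, 8, 4],
              [0, 9, 9, 0, 4]])

result = window_take(x, rows=[0, 1, 3], starts=[1, 2, 3], width=2)
print(result)
```

The rows and starts do not need to be evenly spaced here, which is the main difference from a stride-based view.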

Upvotes: 0

hpaulj

Reputation: 231728

For the general case it will be hard to beat a row by row list comprehension:

In [28]: idx = np.array([[0,1,3],[1,2,4],[4,3,5]])
In [29]: [x[i,j:k] for i,j,k in idx]
Out[29]: [array([7, 8]), array([2, 0]), array([9, 2])]

If the resulting arrays are all the same size, they can be combined into one 2d array:

In [30]: np.array(_)
Out[30]: 
array([[7, 8],
       [2, 0],
       [9, 2]])

Another approach is to build the concatenated row and column indices up front and index once. I won't get into the details, but create something like this:

In [27]: x[[0,0,1,1,3,3],[1,2,2,3,3,4]]
Out[27]: array([7, 8, 2, 0, 3, 8])

Selecting from different rows complicates this 2nd approach. Conceptually the first is simpler. Past experience suggests the speed is about the same.
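
For what it's worth, the concatenated-index idea could be sketched roughly as follows. The name `concat_slices` and the `(row, start, stop)` layout of `idx` are assumptions for illustration, not something from the answer itself:

```python
import numpy as np

def concat_slices(a, idx):
    """Gather a[i, j:k] for every (i, j, k) in idx into one flat array."""
    idx = np.asarray(idx)
    rows, starts, stops = idx[:, 0], idx[:, 1], idx[:, 2]
    lengths = stops - starts
    # Row index repeated once per selected element
    flat_rows = np.repeat(rows, lengths)
    # Column indices of every slice, laid end to end
    flat_cols = np.concatenate([np.arange(j, k) for j, k in zip(starts, stops)])
    # Positions at which the flat result can be cut back into per-row pieces
    splits = np.cumsum(lengths)[:-1]
    return a[flat_rows, flat_cols], splits

# The array from the question
x = np.array([[4, 7, 3, 7, 6],
              [7, 9, 5, 7, 8],
              [3, 1, 6, 3, 2],
              [9, 2, 3, 8, 4],
              [0, 9, 9, 0, 4]])

# Slices of different lengths are fine here
flat, splits = concat_slices(x, [[0, 1, 3], [1, 2, 5], [3, 3, 5]])
pieces = np.split(flat, splits)
print(flat)
print(pieces)
```

The index construction still loops in Python over the slices, so this mostly moves the work from indexing to index building.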

For uniform length slices, something like the as_strided trick may be faster, but it requires more understanding.

Some masking based approaches have also been suggested. But the details are more complicated, so I'll leave those to people like @Divakar who have specialized in them.
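
As a rough idea of what a masking approach might look like (a minimal sketch, not necessarily what others would write; `mask_slices` and its arguments are hypothetical names):

```python
import numpy as np

def mask_slices(a, rows, starts, stops):
    """Select a[r, s:e] for each (r, s, e) with a boolean mask; returns a flat copy."""
    rows = np.asarray(rows)
    starts = np.asarray(starts)[:, None]   # shape (n, 1)
    stops = np.asarray(stops)[:, None]     # shape (n, 1)
    cols = np.arange(a.shape[1])           # shape (n_cols,)
    # mask[i, c] is True where column c falls inside slice i
    mask = (cols >= starts) & (cols < stops)
    # a[rows] copies the selected rows; the mask then picks elements in row-major order
    return a[rows][mask]

# The array from the question
x = np.array([[4, 7, 3, 7, 6],
              [7, 9, 5, 7, 8],
              [3, 1, 6, 3, 2],
              [9, 2, 3, 8, 4],
              [0, 9, 9, 0, 4]])

flat = mask_slices(x, rows=[0, 1, 3], starts=[1, 2, 3], stops=[3, 4, 5])
# All slices have the same length here, so the result can be reshaped to 2d
result = flat.reshape(3, -1)
print(result)
```

Note that this copies every selected row in full before masking, so it may be wasteful when the slices are short relative to the row length.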

Upvotes: 0

Mad Physicist

Reputation: 114578

You can use numpy.lib.stride_tricks.as_strided as long as the offsets between rows are uniform:

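# How far the chunk shifts along the columns from one row to the next
offset = 1
# How wide the chunk of each row is
width = 2
# Keep only the rows whose chunk still fits inside the array
n_rows = min(x.shape[0], (x.shape[1] - width) // offset + 1)
view = np.lib.stride_tricks.as_strided(x, shape=(n_rows, width), strides=(x.strides[0] + offset * x.strides[1],) + x.strides[1:])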

The result is guaranteed to be a view into the original data, not a copy.

Since as_strided is ridiculously powerful, be very careful how you use it. For example, make absolutely sure that the view does not go out of bounds in the last few rows.

If you can avoid it, try not to assign anything into a view returned by as_strided. Assignment just increases the dangers of unpredictable behavior and crashing a thousandfold if you don't know exactly what you're doing.
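
To make both warnings concrete, here is a hedged sketch of a wrapper that checks bounds and returns a read-only view. The function name `diagonal_windows` and its `start` parameter are made up for this example; `writeable=False` is an existing keyword of as_strided. It assumes a 2d input:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def diagonal_windows(a, start, width, offset=1):
    """Read-only view whose row i is a[i, start + i*offset : start + i*offset + width]."""
    n_rows, n_cols = a.shape
    if start < 0 or width < 1 or offset < 0 or start + width > n_cols:
        raise ValueError("window does not fit inside the array")
    if offset > 0:
        # Largest row count for which the last window still ends inside the array
        n_rows = min(n_rows, (n_cols - start - width) // offset + 1)
    row_stride, col_stride = a.strides
    return as_strided(
        a[:, start:],                      # view that begins at column `start`
        shape=(n_rows, width),
        strides=(row_stride + offset * col_stride, col_stride),
        writeable=False,                   # accidental assignment raises instead of corrupting memory
    )

# The array from the question
x = np.array([[4, 7, 3, 7, 6],
              [7, 9, 5, 7, 8],
              [3, 1, 6, 3, 2],
              [9, 2, 3, 8, 4],
              [0, 9, 9, 0, 4]])

v = diagonal_windows(x, start=1, width=2)
print(v)
```

With the question's array this yields rows 0, 1 and 2 only, because a fourth window starting at column 4 would run past the last column.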

Upvotes: 2

kuppern87

Reputation: 1135

I would extract diagonal vectors and stack them together, like this:

def diag_slice(x, start, end):
    n_rows = min(*x.shape)-end+1
    columns = [x.diagonal(i)[:n_rows, None] for i in range(start, end)]
    return np.hstack(columns)

In [37]: diag_slice(x, 1, 3)
Out[37]: 
array([[7, 3],
       [5, 7],
       [3, 2]])

Upvotes: 0

alx

Reputation: 844

I guess something like this :D

In:

import numpy as np
x = np.random.randint(0, 10, size=(5, 5))
Out:

array([[7, 3, 3, 1, 9],
       [6, 1, 3, 8, 7],
       [0, 2, 2, 8, 4],
       [8, 8, 1, 8, 8],
       [1, 2, 4, 3, 4]])
In:

list_of_indices = [[0, 1, 3], [1, 2, 4], [3, 3, 5]]  # [row, start, stop]

def func(array, row, start, stop):
    # Slice columns start:stop out of a single row
    return array[row, start:stop]

for row, start, stop in list_of_indices:
    print(func(x, row, start, stop))

Out:

[3 3]
[3 8]
[8 8]

You can modify it for your needs. Good luck!

Upvotes: 0

Fomalhaut

Reputation: 9825

Try:

>>> np.array([x[0, 1:3], x[1, 2:4], x[3, 3:5]])
array([[7, 3],
       [5, 7],
       [8, 4]])

Upvotes: 2
