Ahmed Khalf
Ahmed Khalf

Reputation: 156

Move position of row in Numpy 2d array

Given Array:

np.array([[1, 2],
          [3, 4],
          [5, 6],
          [7, 8],
          [9, 10]])

If I want to move row at indice 1 to indice 3. The output should be:

[[1, 2],
 [5, 6],
 [7, 8],
 [3, 4],
 [9, 10]]

If I want to move row at indice 4 to indice 1. The output should be:

[[1, 2],
 [9, 10],
 [3, 4],
 [5, 6],
 [7, 8]]

What is the fastest way to do this moving operation?

Upvotes: 3

Views: 2090

Answers (3)

Hugues
Hugues

Reputation: 3170

I like the solution by @Mercury but found that it is even faster to avoid the array copy associated with np.roll() and instead operate entirely in-place:

    if index < index2:
      a[index:index2], a[index2] = a[index + 1:index2 + 1], a[index].copy()
    elif index2 < index:
      a[index2 + 1:index + 1], a[index2] = a[index2:index], a[index].copy()

Upvotes: 1

norok2
norok2

Reputation: 26896

What about tuple() indexing on first axis?

E.g.:

arr[(0, 2, 3, 1, 4), :]

and:

arr[(0, 4, 1, 2, 3), :]

for your expected outputs, respectively.


For a way of generating the indices starting from the two indices you could use the following:

def inner_roll(arr, first, last, axis):
    stop = last + 1
    indices = list(range(arr.shape[axis]))
    indices.insert(first, last)
    indices.pop(last + 1)
    slicing = tuple(
        slice(None) if i != axis else indices
        for i, d in enumerate(arr.shape))
    return arr[slicing]

For inputs that are relatively small along the axis on which you are operating (such as for the input in the question) this is quite fast.

Comparing it with a slightly polished version of @Mercury's answer to wrap it in a function and to make it work correctly for arbitrary axis:

import numpy as np


def inner_roll2(arr, first, last, axis):
    if first > last:
        first, last = last, first
        shift = 1
    else:
        shift = -1
    slicing = tuple(
        slice(None) if i != axis else slice(first, last + 1)
        for i, d in enumerate(arr.shape))
    arr[slicing] = np.roll(arr[slicing], shift=shift, axis=axis)
    return arr

and getting some timings:

funcs = inner_roll, inner_roll2
for n in (5, 50, 500):
    for m in (2, 20, 200):
        arr = np.arange(n * m).reshape((n, m))
        print(f'({n:<3d}, {m:<3d})', end='    ')
        for func in funcs:
            results = %timeit -o -q func(arr, 1, 2, 0)
            print(f'{func.__name__:>12s}  {results.best* 1e6:>7.3f} µs', end='    ')
        print()
# (5  , 2  )      inner_roll    5.613 µs     inner_roll2   15.393 µs    
# (5  , 20 )      inner_roll    5.592 µs     inner_roll2   15.468 µs    
# (5  , 200)      inner_roll    5.916 µs     inner_roll2   15.815 µs    
# (50 , 2  )      inner_roll   10.117 µs     inner_roll2   15.517 µs    
# (50 , 20 )      inner_roll   10.360 µs     inner_roll2   15.505 µs    
# (50 , 200)      inner_roll   12.067 µs     inner_roll2   15.886 µs    
# (500, 2  )      inner_roll   55.833 µs     inner_roll2   15.409 µs    
# (500, 20 )      inner_roll   57.364 µs     inner_roll2   15.319 µs    
# (500, 200)      inner_roll  194.408 µs     inner_roll2   15.731 µs    

This indicate that inner_roll() is the fastest approach for your inputs. However, inner_roll2() seems to scale much better with input sizes, and for even modest input sizes, this is already faster than inner_roll().

Note that, while inner_roll() creates a copy, inner_roll2() works in-place (modifying the input arr). This behavior can be modified by adding arr = arr.copy() at the beginning of the body of inner_roll2(), which would make that function slower (of course) and its timings would then be much more affected by the value of m (the size of the non-rolled axes).

On the other hand, if you were to do multiple consecutive rolling operations, inner_roll2() timings would just stack up, while for inner_roll() you only need to do the expensive part once.

Upvotes: 1

Mercury
Mercury

Reputation: 4171

If you look carefully, if you want to put row i at position j, then only rows in the range between i and j are affected; rows outside do not need to be changed. And this change is basically a roll operation. For items a,b,c,d,e, putting item at i=1 to j=3 means that b,c,d will become c,d,b, giving us a,c,d,b,e. The shift is either -1 or +1 depending on if i<j.

i, j = 1,3
i, j, s = (i, j, -1) if i<j else (j, i, 1)
arr[i:j+1] = np.roll(arr[i:j+1],shift=s,axis=0)

Upvotes: 1

Related Questions