Reputation: 132

Numpy, sort rows of a matrix putting zeros first and not modifying the rest of the row

I have a matrix in NumPy, an NxM ndarray, that looks like this:

[
  [ 0, 5, 11, 22, 0, 0, 11, 22], 
  [ 1, 4, 11, 20, 0, 4, 11, 20], 
  [ 1, 6, 11, 22, 0, 1, 11, 22], 
  [ 4, 7, 12, 21, 0, 4, 12, 21], 
  [ 5, 7, 12, 22, 0, 7, 12, 22], 
  [ 5, 7, 12, 22, 0, 5, 12, 22]
]

I would like to sort it by rows putting the zeros in each row first without changing the order of the other elements along the row.

My desired output is the following:

[
  [ 0, 0, 0, 5, 11, 22, 11, 22], 
  [ 0, 1, 4, 11, 20, 4, 11, 20], 
  [ 0, 1, 6, 11, 22, 1, 11, 22], 
  [ 0, 4, 7, 12, 21, 4, 12, 21], 
  [ 0, 5, 7, 12, 22, 7, 12, 22], 
  [ 0, 5, 7, 12, 22, 5, 12, 22]
]

For efficiency reasons I need to do this in NumPy (so converting to regular nested Python lists and working on those is discouraged). The faster the code, the better.

How could I do that?

Best, Andrea

Upvotes: 4

Answers (4)

Jaime

Reputation: 67467

It is possible to get rid of all the Python looping, building a boolean mask with the help of np.tile and np.repeat, although you will have to time it on some larger example to see if it is worth the extra complexity:

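import numpy as np

rows, cols = a.shape
mask = a != 0                           # positions of the nonzero entries
nonzeros_per_row = mask.sum(axis=1)
# Per row: (number of zeros, number of nonzeros), flattened row by row
repeats = np.column_stack((cols-nonzeros_per_row, nonzeros_per_row)).ravel()
# Per row: that many False values followed by that many True values
new_mask = np.repeat(np.tile([False, True], rows), repeats).reshape(rows, cols)
out = np.zeros_like(a)
out[new_mask] = a[mask]                 # nonzeros land right-aligned, in order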

>>> a
array([[ 0,  5, 11, 22,  0,  0, 11, 22],
       [ 1,  4, 11, 20,  0,  4, 11, 20],
       [ 1,  6, 11, 22,  0,  1, 11, 22],
       [ 4,  7, 12, 21,  0,  4, 12, 21],
       [ 5,  7, 12, 22,  0,  7, 12, 22],
       [ 5,  7, 12, 22,  0,  5, 12, 22]])
>>> out
array([[ 0,  0,  0,  5, 11, 22, 11, 22],
       [ 0,  1,  4, 11, 20,  4, 11, 20],
       [ 0,  1,  6, 11, 22,  1, 11, 22],
       [ 0,  4,  7, 12, 21,  4, 12, 21],
       [ 0,  5,  7, 12, 22,  7, 12, 22],
       [ 0,  5,  7, 12, 22,  5, 12, 22]])
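
The same target mask can probably also be built by broadcasting a column index against the per-row zero count, which avoids np.tile and np.repeat altogether. The following is only a sketch of that variation, not the code above; the function name zeros_first is made up for illustration.

```python
import numpy as np

def zeros_first(a):
    """Return a copy of 2-D array `a` with each row's zeros moved to the front.

    Hypothetical helper name; the relative order of nonzero entries is kept.
    """
    a = np.asarray(a)
    rows, cols = a.shape
    mask = a != 0                              # where the nonzero values are now
    zeros_per_row = cols - mask.sum(axis=1)    # how many zeros each row holds
    # Column j of row i receives a nonzero value iff j >= zeros_per_row[i].
    new_mask = np.arange(cols) >= zeros_per_row[:, None]
    out = np.zeros_like(a)
    # Boolean indexing walks both masks in row-major order, and each row has
    # the same number of True entries in both, so values stay in their row.
    out[new_mask] = a[mask]
    return out

a = np.array([[0, 5, 11, 22, 0, 0, 11, 22],
              [1, 4, 11, 20, 0, 4, 11, 20]])
result = zeros_first(a)
```

As with the tile/repeat version, it would need timing on realistic data before deciding whether it is worth using.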

Upvotes: 0

dabhaid

Reputation: 3879

This approach gets a binary array of where your array is zero and non-zero, then gets the sort index for that, then applies that to the original array.

You'll need an array as big as your to-be-sorted array to hold the index, but since it's all numpy operations it might be faster than looping.

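# Sort key: False where a is zero, True elsewhere (a > 0 would also move negatives)
ind = (a != 0).argsort(axis=1, kind='stable')  # stable sort keeps the nonzeros in order
out = a[np.arange(ind.shape[0])[:,None], ind]

output:

>>> out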
array([[ 0,  0,  0,  5, 11, 22, 11, 22],
       [ 0,  1,  4, 11, 20,  4, 11, 20],
       [ 0,  1,  6, 11, 22,  1, 11, 22],
       [ 0,  4,  7, 12, 21,  4, 12, 21],
       [ 0,  5,  7, 12, 22,  7, 12, 22],
       [ 0,  5,  7, 12, 22,  5, 12, 22]])
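
One thing to check with any argsort-based approach is stability: only a stable sort is guaranteed to leave elements with equal keys in their original order, and argsort's default algorithm is not documented as stable. Below is a small sketch on a single made-up row that includes a negative value; np.take_along_axis is used here as an equivalent of the np.arange fancy-indexing step.

```python
import numpy as np

row = np.array([[3, 0, -2, 0, 7, 1]])

# Sort key: False for zeros, True for everything else, including -2.
key = row != 0

# A stable sort leaves equal keys in their original left-to-right order.
idx = key.argsort(axis=1, kind='stable')

# Gather each row's elements in the order given by idx.
sorted_row = np.take_along_axis(row, idx, axis=1)
print(sorted_row)  # [[ 0  0  3 -2  7  1]]
```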

Upvotes: 2

toine

Reputation: 2026

This may not be the most efficient approach, since it loops over the rows, but it may be a good starting point:

import numpy as np

a = np.array([[ 0,  5, 11, 22,  0,  0, 11, 22],
             [ 1,  4, 11, 20,  0,  4, 11, 20],
             [ 1,  6, 11, 22,  0,  1, 11, 22],
             [ 4,  7, 12, 21,  0,  4, 12, 21],
             [ 5,  7, 12, 22,  0,  7, 12, 22],
             [ 5,  7, 12, 22,  0,  5, 12, 22]])

size = a.shape[1]

for i in range(a.shape[0]):
    nz = np.nonzero(a[i])[0]                         # indices of the nonzero entries
    z = np.zeros(size - nz.shape[0], dtype=a.dtype)  # front padding, same dtype as a
    a[i] = np.concatenate((z, a[i][nz]))             # overwrite the row in place

For each row of a, you find the nonzero indices and prepend enough zeros to restore the row length.

Upvotes: 1

wim

Reputation: 363253

Is a loop over rows allowed?

>>> a
array([[ 0,  5, 11, 22,  0,  0, 11, 22],
       [ 1,  4, 11, 20,  0,  4, 11, 20],
       [ 1,  6, 11, 22,  0,  1, 11, 22],
       [ 4,  7, 12, 21,  0,  4, 12, 21],
       [ 5,  7, 12, 22,  0,  7, 12, 22],
       [ 5,  7, 12, 22,  0,  5, 12, 22]])
>>> for row in a:
...     row[:] = np.r_[row[row == 0], row[row != 0]]
...     
>>> a
array([[ 0,  0,  0,  5, 11, 22, 11, 22],
       [ 0,  1,  4, 11, 20,  4, 11, 20],
       [ 0,  1,  6, 11, 22,  1, 11, 22],
       [ 0,  4,  7, 12, 21,  4, 12, 21],
       [ 0,  5,  7, 12, 22,  7, 12, 22],
       [ 0,  5,  7, 12, 22,  5, 12, 22]])
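
Whether a row loop is fast enough is easiest to settle by timing it against a vectorized variant on a larger array. The sketch below is one way to do that; the array size, the share of zeros, and the function names are arbitrary choices for illustration, and the timings will depend on the machine and NumPy version.

```python
import timeit
import numpy as np

def loop_version(a):
    out = a.copy()
    for row in out:                      # each `row` is a view into `out`
        row[:] = np.r_[row[row == 0], row[row != 0]]
    return out

def argsort_version(a):
    # Stable sort on a zero/nonzero key keeps the nonzeros in their order.
    idx = (a != 0).argsort(axis=1, kind='stable')
    return np.take_along_axis(a, idx, axis=1)

rng = np.random.default_rng(0)
# Arbitrary test data: 2000 x 50 integers in [0, 5), so roughly 20% zeros.
big = rng.integers(0, 5, size=(2000, 50))

# Sanity check before timing: both versions must agree.
same = np.array_equal(loop_version(big), argsort_version(big))

for fn in (loop_version, argsort_version):
    seconds = timeit.timeit(lambda: fn(big), number=5)
    print(f"{fn.__name__}: {seconds / 5:.4f} s per call")
```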

Upvotes: 2
