Generate a specific number of ones for each row, but only where x = zeroes

Question

Suppose I have an array like

a = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
             [0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
             [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])

I want each row to have a specific number of ones, say 5 per row. So the first row needs 1 more one, the second row needs 3, and the third needs 2. The new ones should be placed at random positions where the value is currently 0.
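To make the arithmetic concrete, here is a small sketch of how the number of missing ones per row could be computed. The names `target` and `missing` are only illustrative.

```python
import numpy as np

a = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
              [0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
              [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])

target = 5  # desired number of ones per row (illustrative name)
# Ones already present per row, subtracted from the target;
# clipped at 0 so a row that already has enough ones needs none.
missing = np.maximum(target - a.sum(axis=1), 0)
print(missing)  # [1 3 2]
```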

How do I do this?

javidcf · Accepted Answer

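This was a bit tricky, but here is a fully vectorized solution. The idea is to shuffle each row, sort it so the zeros come first (in random order because of the shuffle), turn the first few zeros into ones, and then undo both reorderings: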

import numpy as np

def add_ones_up_to(data, n):
    # Count number of ones to add to each row
    c = np.maximum(n - np.count_nonzero(data, axis=-1), 0)
    # Make row-shuffling indices
    shuffle = np.argsort(np.random.random(data.shape), axis=-1)
    # Row-shuffled data
    data_shuffled = np.take_along_axis(data, shuffle, axis=-1)
    # Sorting indices for shuffled data (indices of zeros will be first)
    sorter = np.argsort(np.abs(data_shuffled), axis=-1)
    # Sorted row-shuffled data
    data_sort = np.take_along_axis(data_shuffled, sorter, axis=-1)
    # Mask for number of ones to add
    m = c[..., np.newaxis] > np.arange(data.shape[-1])
    # Replace values with ones or previous value depending on mask
    data_sort = np.where(m, 1, data_sort)
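    # Undo sorting and shuffling:
    # first build the inverse of the sorting permutation...
    reorderer = np.empty_like(sorter)
    np.put_along_axis(reorderer, sorter, np.arange(reorderer.shape[-1]), axis=-1)
    # ...then compose it with the inverse of the shuffle, so that reorderer
    # maps each original position to its position in data_sort
    np.put_along_axis(reorderer, shuffle, reorderer.copy(), axis=-1)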
    return np.take_along_axis(data_sort, reorderer, axis=-1)

np.random.seed(100)
data = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
                 [0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
                 [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])
n = 5
print(add_ones_up_to(data, n))
# [[0 1 1 1 1 0 0 0 1 0]
#  [0 1 1 1 0 1 0 1 0 0]
#  [1 0 0 0 0 1 1 0 1 1]]
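As a possible alternative (not the method above, just a sketch under the same assumptions about the input), the shuffle/sort/undo steps can be replaced by ranking random keys: give every zero cell a random key, give every nonzero cell an infinite key, and flip the zero cells whose key ranks among the smallest in their row. The function name `add_ones_random_keys` and the `rng` parameter are made up for this illustration.

```python
import numpy as np

def add_ones_random_keys(data, n, rng=None):
    # Hypothetical helper for illustration; not part of the answer above.
    rng = np.random.default_rng() if rng is None else rng
    # Number of ones still missing in each row (never negative)
    need = np.maximum(n - np.count_nonzero(data, axis=-1), 0)
    # Random key for each zero cell; nonzero cells get +inf so they rank last
    keys = np.where(data == 0, rng.random(data.shape), np.inf)
    # rank[i, j] = position of cell j in row i's ascending key order
    rank = np.argsort(np.argsort(keys, axis=-1), axis=-1)
    # Zero cells holding one of the `need` smallest keys become ones
    pick = (rank < need[..., np.newaxis]) & (data == 0)
    return np.where(pick, 1, data)

data = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
                 [0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
                 [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])
out = add_ones_random_keys(data, 5, np.random.default_rng(0))
print(out.sum(axis=1))  # [5 5 5]
```

Which cells get filled depends on the random generator, so only the row sums are shown. Existing ones are never moved, and a row that already has `n` or more ones is returned unchanged.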
