NumpyNoob

Reputation: 17

Generate a specific number of ones in each row, but only where the value is zero

Suppose I have an array like

a = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
             [0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
             [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])

I want each row to have a specific number of ones, say 5 per row. So the first row needs 1 more one, the second row needs 3, and the third needs 2. The new ones should be placed at random positions where the current value is 0.

How do I do this?

Upvotes: 1

Views: 102

Answers (2)

javidcf

Reputation: 59701

This was a bit tricky but here is a fully vectorized solution:

import numpy as np

def add_ones_up_to(data, n):
    # Count number of ones to add to each row
    c = np.maximum(n - np.count_nonzero(data, axis=-1), 0)
    # Make row-shuffling indices
    shuffle = np.argsort(np.random.random(data.shape), axis=-1)
    # Row-shuffled data
    data_shuffled = np.take_along_axis(data, shuffle, axis=-1)
    # Sorting indices for shuffled data (indices of zeros will be first)
    sorter = np.argsort(np.abs(data_shuffled), axis=-1)
    # Sorted row-shuffled data
    data_sort = np.take_along_axis(data_shuffled, sorter, axis=-1)
    # Mask for number of ones to add
    m = c[..., np.newaxis] > np.arange(data.shape[-1])
    # Replace values with ones or previous value depending on mask
    data_sort = np.where(m, 1, data_sort)
    # Undo sorting and shuffling
    reorderer = np.empty_like(sorter)
    np.put_along_axis(reorderer, sorter, np.arange(reorderer.shape[-1]), axis=-1)
    np.put_along_axis(reorderer, shuffle, reorderer.copy(), axis=-1)
    return np.take_along_axis(data_sort, reorderer, axis=-1)

np.random.seed(100)
data = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
                 [0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
                 [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])
n = 5
print(add_ones_up_to(data, n))
# [[0 1 1 1 1 0 0 0 1 0]
#  [0 1 1 1 0 1 0 1 0 0]
#  [1 0 0 0 0 1 1 0 1 1]]
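
For comparison, here is a shorter vectorized variant of the same idea. It is only a sketch, not the function above: instead of shuffling and un-shuffling, it gives every zero cell a random key, ranks the keys within each row, and flips the zeros whose rank is below the row's deficit. The name fill_ones_by_rank and the rng parameter are hypothetical, introduced just for this example.

```python
import numpy as np

def fill_ones_by_rank(data, n, rng=None):
    """Hypothetical helper: turn random zeros into ones until each row has n nonzeros."""
    rng = np.random.default_rng() if rng is None else rng
    # How many ones each row still needs (0 if it already has n or more)
    c = np.maximum(n - np.count_nonzero(data, axis=-1), 0)
    # Random key in [0, 1) for every zero cell; nonzero cells get inf so they rank last
    keys = np.where(data == 0, rng.random(data.shape), np.inf)
    # Double argsort converts keys into per-row ranks (0 = smallest key)
    ranks = np.argsort(np.argsort(keys, axis=-1), axis=-1)
    # Flip only the zero cells whose rank is below the row's deficit
    return np.where((data == 0) & (ranks < c[..., np.newaxis]), 1, data)

data = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
                 [0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
                 [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])
out = fill_ones_by_rank(data, 5, rng=np.random.default_rng(0))
```

Because the zero cells always hold the lowest ranks in their row, the existing ones are never touched, and a row with fewer zeros than its deficit simply gets all of its zeros filled.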

Upvotes: 1

Alex bGoode

Reputation: 571

import numpy as np

a = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
             [0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
             [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])

ones = 5

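# Number of ones each row is missing (clamped at 0 for rows that already have enough)
to_add = np.maximum(ones - np.count_nonzero(a, axis=1), 0)

for i in range(a.shape[0]):
    # Pick to_add[i] distinct positions among the zeros of row i
    idx = np.random.choice(np.flatnonzero(a[i, :] == 0), size=to_add[i], replace=False)
    a[i, idx] = 1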

For each row, count the nonzero entries to work out how many ones to add. Then choose that many indices at random, without replacement, from the positions where a is zero, and set those to 1.
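
If you want this as a reusable function, one possible sketch is below. The name add_ones_loop and the rng parameter are hypothetical, added for illustration. It assumes a 2-D integer array, works on a copy instead of modifying a in place, and caps the request at the number of zeros available so that choice never asks for more positions than exist.

```python
import numpy as np

def add_ones_loop(a, ones, rng=None):
    """Hypothetical wrapper around the loop above; returns a modified copy of a."""
    rng = np.random.default_rng() if rng is None else rng
    out = a.copy()
    # Ones missing per row, clamped at 0 for rows that already have enough
    to_add = np.maximum(ones - np.count_nonzero(out, axis=1), 0)
    for i in range(out.shape[0]):
        zeros = np.flatnonzero(out[i] == 0)
        # Never request more positions than there are zeros in the row
        k = min(int(to_add[i]), zeros.size)
        if k:
            idx = rng.choice(zeros, size=k, replace=False)
            out[i, idx] = 1
    return out

a_demo = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
                   [0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
                   [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])
result = add_ones_loop(a_demo, 5, rng=np.random.default_rng(42))
```

Passing a seeded generator makes the result reproducible without touching NumPy's global random state.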

Upvotes: 1
