Reputation: 17
Suppose I have an array like this:
a = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
              [0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
              [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])
I want each row to have a specific number of ones, say 5 per row. So the first row needs 1 more one, the second row needs 3, and the third needs 2. The new ones should be placed at random positions where the array is currently 0.
How do I do this?
Upvotes: 1
Views: 102
Reputation: 59701
This was a bit tricky but here is a fully vectorized solution:
import numpy as np
def add_ones_up_to(data, n):
    # Count number of ones to add to each row
    c = np.maximum(n - np.count_nonzero(data, axis=-1), 0)
    # Make row-shuffling indices
    shuffle = np.argsort(np.random.random(data.shape), axis=-1)
    # Row-shuffled data
    data_shuffled = np.take_along_axis(data, shuffle, axis=-1)
    # Sorting indices for shuffled data (indices of zeros will be first)
    sorter = np.argsort(np.abs(data_shuffled), axis=-1)
    # Sorted row-shuffled data
    data_sort = np.take_along_axis(data_shuffled, sorter, axis=-1)
    # Mask for number of ones to add
    m = c[..., np.newaxis] > np.arange(data.shape[-1])
    # Replace values with ones or previous value depending on mask
    data_sort = np.where(m, 1, data_sort)
    # Undo sorting and shuffling
    reorderer = np.empty_like(sorter)
    np.put_along_axis(reorderer, sorter, np.arange(reorderer.shape[-1]), axis=-1)
    np.put_along_axis(reorderer, shuffle, reorderer.copy(), axis=-1)
    return np.take_along_axis(data_sort, reorderer, axis=-1)

np.random.seed(100)
data = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
                 [0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
                 [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])
n = 5
print(add_ones_up_to(data, n))
# [[0 1 1 1 1 0 0 0 1 0]
# [0 1 1 1 0 1 0 1 0 0]
# [1 0 0 0 0 1 1 0 1 1]]
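As a point of comparison, the same idea (randomly pick some of the zeros in each row, all rows at once) can also be sketched with random keys and ranks instead of shuffling and un-shuffling. This is only an illustrative alternative, not the function above: the name `add_ones_by_rank` and the `rng` parameter are made up for this sketch, and it assumes the array holds only 0s and 1s.

```python
import numpy as np

def add_ones_by_rank(data, n, rng=None):
    # Hypothetical helper for illustration; assumes data contains only 0s and 1s
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data)
    # Number of ones missing from each row (never negative)
    c = np.maximum(n - np.count_nonzero(data, axis=-1), 0)
    # One random key per cell; nonzero cells get +inf so they sort after every zero
    keys = rng.random(data.shape)
    keys[data != 0] = np.inf
    # Rank of each cell within its row (0 = smallest key)
    ranks = np.argsort(np.argsort(keys, axis=-1), axis=-1)
    # In each row, the c zeros with the smallest keys become ones
    return np.where(ranks < c[..., np.newaxis], 1, data)

data = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
                 [0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
                 [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])
out = add_ones_by_rank(data, 5, rng=np.random.default_rng(0))
print(out)
```

The double `argsort` turns the random keys into per-row ranks, so no inverse permutation has to be built afterwards. Which positions get filled depends on the random generator, so no particular output is shown here.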
Upvotes: 1
Reputation: 571
import numpy as np
a = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
              [0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
              [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])
ones = 5
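# Number of ones missing from each row (clamped so rows that already have enough are left alone)
to_add = np.maximum(ones - np.count_nonzero(a, axis=1), 0)
for i in range(a.shape[0]):
    # Pick that many zero positions in row i, without repeats, and set them to 1
    idx = np.random.choice(np.flatnonzero(a[i, :] == 0), size=to_add[i], replace=False)
    a[i, idx] = 1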
For each row, count the nonzero entries to work out how many ones to add. Then choose that many indices, without replacement, from the positions where a is zero, and set those to 1.
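If it helps, here is a rough sketch of the same loop wrapped in a function that leaves the input untouched and takes an explicit random generator, which makes results reproducible. The function name `fill_rows_with_ones` and its `rng` parameter are made up for this example; it also caps the request at the number of zeros available, which is an added assumption about how that edge case should behave.

```python
import numpy as np

def fill_rows_with_ones(a, ones, rng=None):
    # Hypothetical wrapper around the loop above; works on a copy of the input
    rng = np.random.default_rng() if rng is None else rng
    out = np.array(a, copy=True)
    # Ones missing per row, clamped at zero for rows that already have enough
    to_add = np.maximum(ones - np.count_nonzero(out, axis=1), 0)
    for i, k in enumerate(to_add):
        zeros = np.flatnonzero(out[i] == 0)
        # Never ask for more positions than there are zeros in the row
        k = min(int(k), zeros.size)
        if k == 0:
            continue
        idx = rng.choice(zeros, size=k, replace=False)
        out[i, idx] = 1
    return out

a = np.array([[0, 1, 1, 1, 0, 0, 0, 0, 1, 0],
              [0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
              [1, 0, 0, 0, 0, 1, 0, 0, 1, 0]])
filled = fill_rows_with_ones(a, 5, rng=np.random.default_rng(42))
print(filled.sum(axis=1))
```

Passing a seeded `np.random.default_rng(...)` gives repeatable results without touching NumPy's global random state.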
Upvotes: 1