Reputation: 13660
I have an array
my_array = np.array([1,0,1,0,0,1,1,0,1,0])
In this array, 50% of the items are 1's. I want to efficiently and randomly switch some of the 1's to 0's so that the ratio is 20%.
new_array = switch_function(my_array)
print(new_array)
array([0,0,0,0,0,1,0,0,1,0]) #random switching retaining order
This seems like it should be simple and everything I thought of seems over-engineered. Thanks for the help!
Upvotes: 0
Views: 188
Reputation: 93
If you do not need to keep the existing zeros as zeros and simply want about 20% of the whole array to be 1's on average, you could go through the array with a "for" loop and call randint(1,5) for each element (Python's random.randint, which includes both endpoints). If randint returns 1, set the element to 1; otherwise set it to 0.
If, however, you want to keep all the original zeros, you need to reduce the number of 1's to 40% of the current count (20% / 50%). Go through the array and, for each element that is 1, call randint(1,5): if it returns 1 or 2, keep the 1; otherwise change it to 0. Note that this gives 20% only on average, not exactly.
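The second variant above can also be written without an explicit loop. This is only a sketch of the same idea: it uses NumPy's Generator API instead of randint, and the seed and the name thinned are arbitrary choices, not part of the answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily, for reproducibility

my_array = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 0])

# Draw one uniform number per element; an element survives with probability 0.4.
keep = rng.random(my_array.shape) < 0.4

# Multiplying by the boolean mask leaves zeros as zeros and keeps
# each 1 only where the mask is True.
thinned = my_array * keep
```

Because each 1 is kept independently, the fraction of 1's in thinned is 20% only in expectation; a small array like this one can easily end up above or below it.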
Upvotes: 0
Reputation: 353059
IIUC, something like this should work:
>>> arr = np.array([0,0,0,0,0,1,1,1,1,1])
>>> want_frac = 0.2
>>> n = int(round(arr.sum() - want_frac * len(arr)))
>>> indices_to_flip = np.random.choice(arr.nonzero()[0], n, replace=False)
>>> arr[indices_to_flip] = 0
>>> arr
array([0, 0, 0, 0, 0, 0, 1, 0, 1, 0])
>>> arr.mean()
0.20000000000000001
First we figure out how many numbers we need to flip (trying our best to get close to the right value), then we randomly choose n of the nonzero indices, and finally we set them to zero.
Note that, as JFS points out in the comments, you should verify that n > 0 to make sure you don't accidentally make changes you don't intend.
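Putting the steps and the n > 0 check together might look like the following sketch. The function name switch_function comes from the question; the want_frac and rng parameters, and the decision to return a copy instead of modifying the input, are assumptions made for illustration.

```python
import numpy as np

def switch_function(arr, want_frac=0.2, rng=None):
    """Return a copy of arr with random 1's set to 0 so that about want_frac remain."""
    rng = np.random.default_rng() if rng is None else rng
    out = np.array(arr, copy=True)
    # Number of 1's to flip, rounded to the nearest whole element.
    n = int(round(out.sum() - want_frac * len(out)))
    if n > 0:  # do nothing if the array is already at or below the target
        ones = np.flatnonzero(out)
        flip = rng.choice(ones, size=n, replace=False)
        out[flip] = 0
    return out

my_array = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 0])
new_array = switch_function(my_array)
```

With the 10-element array from the question, new_array always has exactly two 1's, both at positions that held a 1 in my_array; which two survive varies from run to run.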
Upvotes: 4
Reputation: 2206
There are many ways to accomplish this kind of task. Here is a simple approach.
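import numpy as np

# Get the array length
N = len(my_array)
# Proportion of 1's
p = np.sum(my_array) / float(N)
# Locations of 1's
idx = np.flatnonzero(my_array == 1)
# Calculate how many 1's to change; round to the nearest whole number
# (p*N - 0.2*N need not be an integer) and never go below zero
k = max(int(round(p * N - 0.2 * N)), 0)
# Sample k of those locations without replacement and set them to 0
my_array[np.random.choice(idx, k, replace=False)] = 0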
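The number of 1's to change is a whole number only when 0.2 * N is. Otherwise it has to be truncated or rounded, and the two can give different counts. A small worked case, using a hypothetical 7-element array that is not from the question:

```python
import numpy as np

# Hypothetical input: four 1's out of seven, so 0.2 * N = 1.4 is not a whole number.
example = np.array([1, 0, 1, 0, 1, 1, 0])
N = len(example)
k = example.sum() - 0.2 * N  # roughly 2.6 ones would need to go

k_trunc = int(k)         # truncation removes 2 ones
k_round = int(round(k))  # rounding removes 3 ones

# Fraction of 1's that would remain in each case
frac_trunc = (example.sum() - k_trunc) / float(N)  # 2/7
frac_round = (example.sum() - k_round) / float(N)  # 1/7
```

Here rounding lands closer to the 20% target (1/7 is about 0.143, against 2/7 at about 0.286), but neither choice can hit it exactly for this length.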
Upvotes: 2