Chris

Reputation: 13660

Randomly re-assign 1's to 0's to reach a specified ratio

I have an array

import numpy as np
my_array = np.array([1,0,1,0,0,1,1,0,1,0])

In this array, 50% of the items are 1's. I want to efficiently and randomly switch some of the 1's to 0's so that 1's make up 20% of the array.

new_array = switch_function(my_array)
print new_array 

array([0,0,0,0,0,1,0,0,1,0]) #random switching retaining order

This seems like it should be simple and everything I thought of seems over-engineered. Thanks for the help!

Upvotes: 0

Views: 188

Answers (3)

John of York

Reputation: 93

If you do not need to keep the existing zeros as zeros and simply want the whole array to average 20% 1's, you could go through the array with a "for" loop and call randint(1,5) for each element. If randint returns 1, set the item to 1; otherwise set it to 0.

If, however, you wish to retain all the original zeros, you need to reduce the number of 1's to 40% of the number there now (20% / 50%). Go through the array and, for each item that is 1, call randint(1,5): if it returns 1 or 2, keep the original 1; otherwise change it to 0. Note that both approaches hit the target ratio only on average, not exactly.
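A minimal sketch of the second approach, assuming Python's standard `random` module (whose `randint` is inclusive on both ends). The function name `thin_ones`, its parameters, and the seed are hypothetical and only there for illustration:

```python
import random

import numpy as np


def thin_ones(arr, keep=2, out_of=5, seed=None):
    """Keep each 1 with probability keep/out_of; zeros are left untouched."""
    rng = random.Random(seed)  # seed is only for reproducible runs
    out = arr.copy()           # do not modify the caller's array
    for i, value in enumerate(out):
        # randint(1, out_of) returns 1..out_of inclusive, so values above
        # `keep` occur with probability (out_of - keep) / out_of
        if value == 1 and rng.randint(1, out_of) > keep:
            out[i] = 0
    return out


my_array = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 0])
result = thin_ones(my_array, seed=0)
print(result)
```

Because every 1 is decided independently, the number of surviving 1's varies from run to run; on a 10-element array it can easily be something other than 2.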

Upvotes: 0

DSM

Reputation: 353059

IIUC, something like this should work:

>>> arr = np.array([0,0,0,0,0,1,1,1,1,1])
>>> want_frac = 0.2
>>> n = int(round(arr.sum() - want_frac * len(arr)))
>>> indices_to_flip = np.random.choice(arr.nonzero()[0], n, replace=False)
>>> arr[indices_to_flip] = 0
>>> arr
array([0, 0, 0, 0, 0, 0, 1, 0, 1, 0])
>>> arr.mean()
0.20000000000000001

First we figure out how many numbers we need to flip (trying our best to get close to the right value), then we randomly choose n of the nonzero indices, and finally we set them to zero.

As JFS notes in the comments, you should verify that n > 0 to make sure you don't accidentally make changes you don't intend.
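The same steps, wrapped in a function with the n > 0 check included, might look like the sketch below. The name `reduce_ones` and its parameters are hypothetical, and it uses NumPy's `default_rng` generator instead of the legacy `np.random.choice` shown above:

```python
import numpy as np


def reduce_ones(arr, want_frac, rng=None):
    """Return a copy of arr with random 1's zeroed so that mean() is about want_frac."""
    rng = np.random.default_rng() if rng is None else rng
    out = arr.copy()
    # Number of 1's to flip: current count minus the target count
    n = int(round(out.sum() - want_frac * len(out)))
    if n > 0:  # skip when there are already few enough 1's
        ones = np.flatnonzero(out)  # positions of the nonzero entries
        out[rng.choice(ones, size=n, replace=False)] = 0
    return out


my_array = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 0])
new_array = reduce_ones(my_array, 0.2, rng=np.random.default_rng(0))
print(new_array, new_array.mean())

# Already below the target fraction, so nothing is changed
sparse = np.array([0, 0, 0, 0, 1])
unchanged = reduce_ones(sparse, 0.5)
```

Working on a copy keeps the original array intact; drop the `.copy()` if in-place modification is what you want.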

Upvotes: 4

Ellis Valentiner

Reputation: 2206

There are many ways to accomplish this kind of task. Here is a simple approach.

# Get the array length
N = len(my_array)

# Proportion of 1's
p = np.sum(my_array) / float(N)

# Locations of 1's
idx = np.arange(0, N)[my_array == 1]

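# Calculate how many idx to change (rounded, because the float result can land just below a whole number)
k = int(round((p*N) - (0.2 * N)))

# Sample k of the idx without replacement and change values to 0
my_array[np.random.choice(idx, k, replace=False)] = 0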
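The conversion of k to an integer deserves a second look. Since p is a float, p*N does not always reproduce the exact count of 1's, and plain truncation can then come out one lower than the nearest integer. A small check, using a hypothetical 100-element input with 57 ones, compares the two conversions:

```python
import numpy as np

# Hypothetical input: 100 elements, 57 of them 1's
my_array = np.array([1] * 57 + [0] * 43)

N = len(my_array)
p = np.sum(my_array) / float(N)
k = (p * N) - (0.2 * N)

truncated = int(k)       # drops any fractional part
rounded = int(round(k))  # nearest whole number

print(repr(k), truncated, rounded)
```

If the two values differ on your machine, the rounded one is the count that matches the intended 57 - 20.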

Upvotes: 2
