Reputation: 918
First I create my array
import numpy as np
myarray = np.random.randint(0, 11, size=20)  # integers from 0 to 10 inclusive (random_integers is deprecated)
Then, I want to set 20% of the elements in the array to 0 (or some other number). How should I do this? Apply a mask?
Upvotes: 20
Views: 10377
Reputation: 4525
Assume your input NumPy array is A and p=0.2. The following are a couple of ways to achieve this.
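ones = np.ones(A.size, dtype=A.dtype)  # match A's dtype so the in-place multiply also works for integer arrays
idx = int(min(p*A.size, A.size))  # number of elements to zero
ones[:idx] = 0
A *= np.reshape(np.random.permutation(ones), A.shape)  # shuffle the mask, then apply it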
This is commonly done in several denoising objectives, most notably Masked Language Modeling in Transformer pre-training. Here is a more concise way of zeroing each element independently with probability 0.2, so that about 20% of the elements end up zero on average (the exact count varies from run to run):
A *= np.random.binomial(size=A.shape, n=1, p=0.8)
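Another alternative, which also zeroes each element independently with probability p (np.random.randint(0, 2, A.shape) would zero about half of the elements rather than 20%):
A *= (np.random.random(A.shape) >= p)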
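The difference between a mask with a fixed number of zeros and a mask drawn element by element can be checked with a short sketch. It assumes NumPy's `Generator` API (`np.random.default_rng`); the seed, the all-ones test array, and the variable names are illustrative choices, not part of the answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for this sketch
p = 0.2
A = np.ones((10, 10), dtype=int)  # all ones, so every zero comes from the mask

# Exact count: shuffle a mask that contains exactly int(p * A.size) zeros.
keep = np.ones(A.size, dtype=A.dtype)
keep[:int(p * A.size)] = 0
exact = A * rng.permutation(keep).reshape(A.shape)

# Independent draws: each element is zeroed with probability p.
approx = A * rng.binomial(n=1, p=1 - p, size=A.shape)

n_exact = int((exact == 0).sum())    # always int(p * A.size)
n_approx = int((approx == 0).sum())  # fluctuates around p * A.size
```

If the number of zeroed elements must be exact, the shuffled mask is the safer choice; the per-element draw only matches the target proportion on average.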
Upvotes: 0
Reputation: 331
For others looking for the answer in the case of an n-dimensional array, as proposed by user holi:
my_array = np.random.rand(8, 50)
indices = np.random.choice(my_array.size, replace=False, size=int(my_array.size * 0.2))
Here my_array.size is the product of the dimensions (dim1*dim2), so the indices are drawn from the flattened array. We then convert these flat indices to per-axis indices and apply them to our array:
my_array[np.unravel_index(indices, my_array.shape)] = 0
The array is now masked.
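As a possible shortcut for arrays of any shape, flat indices can also be assigned through `ndarray.flat`, which skips the `np.unravel_index` step. This is a minimal sketch, assuming NumPy's `Generator` API; the helper name `zero_fraction_nd` and the seed are illustrative and not from the answer above.

```python
import numpy as np

def zero_fraction_nd(arr, frac, rng):
    """Set a random `frac` of the elements of `arr` to 0, in place (hypothetical helper)."""
    n_zero = int(arr.size * frac)
    # Draw distinct flat indices over the whole array.
    flat_idx = rng.choice(arr.size, size=n_zero, replace=False)
    # arr.flat accepts flat indices regardless of the array's shape.
    arr.flat[flat_idx] = 0
    return arr

rng = np.random.default_rng(42)  # arbitrary seed
my_array = rng.random((8, 50)) + 1.0  # values in [1, 2), so no zeros beforehand
zero_fraction_nd(my_array, 0.2, rng)
n_zeros = int((my_array == 0).sum())
```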
Upvotes: 4
Reputation: 85442
You can calculate the indices with np.random.choice, limiting the number of chosen indices to the percentage:
indices = np.random.choice(np.arange(myarray.size), replace=False,
                           size=int(myarray.size * 0.2))
myarray[indices] = 0
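Since the question also asks about setting the elements to "some other number", the same idea might be sketched with a configurable replacement value. The names `fill_value` and `fraction`, the seed, and the use of `np.random.default_rng` are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, for reproducibility
myarray = rng.integers(1, 11, size=20)  # 1..10, so the array starts without the fill value

fill_value = -1   # assumed replacement value; use 0 to match the question
fraction = 0.2

n_replace = int(myarray.size * fraction)
# replace=False guarantees distinct positions, so exactly n_replace elements change.
indices = rng.choice(myarray.size, size=n_replace, replace=False)
myarray[indices] = fill_value

n_filled = int((myarray == fill_value).sum())
```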
Upvotes: 24
Reputation: 895
Use np.random.permutation as a random index generator, and take the first 20% of the indices.
myarray = np.random.randint(0, 11, size=20)  # 0 to 10 inclusive; random_integers is deprecated
n = len(myarray)
random_idx = np.random.permutation(n)
frac = 20 # [%]
zero_idx = random_idx[:round(n*frac/100)]
myarray[zero_idx] = 0
Upvotes: 1
Reputation: 466
If you want the 20% to be random:
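import random

# Wrapped in a function (the name is illustrative) so that the return is valid.
def zero_twenty_percent(myarray):
    random_list = []
    array_len = len(myarray)
    # Collect distinct positions until 20% of the array is covered.
    while len(random_list) < (array_len/5):
        # randrange excludes array_len, so every position is a valid index.
        random_int = random.randrange(array_len)
        if random_int not in random_list:
            random_list.append(random_int)
    for position in random_list:
        myarray[position] = 0
    return myarray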
This ensures that exactly 20% of the positions (rounded up) are set to 0: because duplicate draws are rejected, the RNG rolling the same number several times cannot leave fewer than 20% of the values zeroed.
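The same guarantee of distinct positions can probably be had without a retry loop by using the standard library's `random.sample`. This is a sketch under assumptions: the helper name `zero_fraction_sample` and its `fraction` parameter are hypothetical, and the count is rounded up to mirror the `array_len/5` bound of the loop above.

```python
import math
import random

def zero_fraction_sample(values, fraction=0.2):
    """Zero a random `fraction` of `values` in place (hypothetical helper)."""
    n_zero = math.ceil(len(values) * fraction)
    # random.sample returns n_zero distinct positions, so no retry loop is needed.
    for position in random.sample(range(len(values)), n_zero):
        values[position] = 0
    return values

data = list(range(1, 21))  # 20 non-zero values
zero_fraction_sample(data)
n_zeros = data.count(0)
```

This works on plain lists as well as 1-D NumPy arrays, since it only relies on `len` and item assignment.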
Upvotes: 0