Reputation: 688
I have seen answers for each part of my question. For example, np.where(arr == b, c, arr) returns a copy of arr with every b replaced by c, and arr[arr == b] = c does the same in place. However, I have 1000 labels in a numpy array, labels_test, containing the labels 1 and 6. I want to flip 30 percent of the correct labels to wrong ones to create an erroneous dataset, so I created the following list of indices that should be changed:
l = [np.random.choice(1000) for x in range(100)]
(I am not sure whether this can pick the same index more than once.)
I want something like:
np.put(labels_test, l, if labels_test[l] == 1 then 6, and if labels_test[l] == 6 then 1)
It can be tried on the following toy example:
np.random.seed(1)
labels_test = [np.random.choice([1,6]) for x in range(20)]
[6, 6, 1, 1, 6, 6, 6, 6, 6, 1, 1, 6, 1, 6, 6, 1, 1, 6, 1, 1]
Upvotes: 4
Views: 89
Reputation: 500177
Here is one way:
>>> labels_test = np.random.choice([1, 6], 20)
>>> ind = np.random.choice(labels_test.shape[0], labels_test.shape[0]//3, replace=False)
>>> labels_test
array([1, 6, 1, 1, 6, 1, 1, 1, 6, 1, 1, 1, 6, 6, 6, 6, 6, 1, 1, 1])
>>> labels_test[ind] = 7 - labels_test[ind]
>>> labels_test
array([1, 6, 1, 6, 6, 6, 1, 1, 6, 1, 6, 1, 1, 6, 1, 6, 6, 1, 1, 6])
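The expression 7 - labels_test[ind] swaps the two labels because 1 + 6 = 7, so 1 becomes 6 and 6 becomes 1. As written, this flips exactly one third of the labels (labels_test.shape[0]//3, rounded down to the nearest integer) by sampling indices without replacement, so no index is chosen twice. To flip 30% instead, use int(0.3 * labels_test.shape[0]) as the sample size. Depending on your requirements, a suitable alternative might be to select every label independently with probability 0.3, in which case the number of flipped labels varies from run to run.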
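Both variants could be sketched roughly as follows. This is only an illustration, not the one right way to do it: the helper names flip_exact and flip_bernoulli are made up for this example, it assumes the array contains only the labels 1 and 6, and it uses NumPy's newer np.random.default_rng generator rather than the legacy np.random functions shown above.

```python
import numpy as np

def flip_exact(labels, fraction, rng):
    """Return a copy with round(fraction * n) labels swapped (1 <-> 6).

    Hypothetical helper; assumes labels contains only 1 and 6.
    """
    out = labels.copy()
    n_flip = int(round(fraction * out.shape[0]))
    # replace=False guarantees that no index is picked twice
    ind = rng.choice(out.shape[0], n_flip, replace=False)
    out[ind] = 7 - out[ind]  # 1 -> 6 and 6 -> 1
    return out

def flip_bernoulli(labels, p, rng):
    """Return a copy where each label is swapped independently with probability p.

    Hypothetical helper; the number of flips is random, about p * n on average.
    """
    mask = rng.random(labels.shape[0]) < p
    return np.where(mask, 7 - labels, labels)

rng = np.random.default_rng(0)
labels_test = rng.choice([1, 6], 1000)

noisy_exact = flip_exact(labels_test, 0.3, rng)
noisy_random = flip_bernoulli(labels_test, 0.3, rng)

print((noisy_exact != labels_test).sum())   # 300 labels differ
print((noisy_random != labels_test).sum())  # varies with the seed
```

Both helpers return a new array and leave labels_test untouched, which makes it easy to compare the noisy labels against the originals afterwards.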
Upvotes: 2