Randomly remove and save elements from matrix

Question

I wrote this code snippet in Python:

import numpy as np

def remove_randomly(data, percentage):
    test_list = []
    np.random.shuffle(data)  # shuffles the rows in place
    for i in range(data.shape[0]):
        for j in range(data.shape[1]):
            roll = np.random.randint(low=1, high=100)  # integer in [1, 99]
            if roll > percentage:
                # save the element, then zero it out
                test_list.append((i, j, data[i, j]))
                data[i, j] = 0
    return test_list

It takes a matrix data and a number percentage, iterates over the entire matrix, and zeros out roughly (100 - percentage)% of the elements, saving them to another object called test_list.

Is there a better, more efficient way to achieve this result? I've heard nested loops are bad for your health. Plus, my data matrix happens to be huge, so iterating with for loops is very slow.

Example

Suppose data is the matrix [1, 2; 3, 4] and percentage is 75, so that about 25% of the elements are removed.

Then I would like the output to be (for example) data = [1, 2; 0, 4] and test_list = [(1, 0, 3)]

ForceBru · Accepted Answer

Here's what you can do:

import numpy as np

def remove_randomly(data, percent):
    np.random.shuffle(data)  # shuffles the rows in place
    # array of random integers in [1, 99] with the same shape as data
    roll = np.random.randint(1, 100, data.shape)
    # indices of the elements whose roll is greater than the percentage
    indices = np.where(roll > percent)

    test_list = data[indices]  # copy of the selected values
    data[indices] = 0

    return indices, test_list  # return indices and the values
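
The question asks for (i, j, value) tuples rather than separate index and value arrays. One way to get them is to zip the two together. Here is a small sketch on the matrix from the question's example, using a hard-coded mask as a stand-in for roll > percent (chosen only so the result is reproducible):

```python
import numpy as np

data = np.array([[1, 2], [3, 4]])

# Hypothetical mask standing in for `roll > percent`; True marks an element to remove.
mask = np.array([[False, False], [True, False]])

indices = np.where(mask)   # tuple of (row indices, column indices)
values = data[indices]     # fancy indexing returns a copy of the values
data[indices] = 0          # zero out the selected elements in place

# Combine row index, column index and value into (i, j, value) tuples.
test_list = [(int(i), int(j), int(v)) for i, j, v in zip(*indices, values)]

print(data)       # [[1 2]
                  #  [0 4]]
print(test_list)  # [(1, 0, 3)]
```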

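Note that np.random.randint(1, 100) only generates integers in the half-open range [1, 100), i.e. 1 through 99, so 100 is never rolled. For an integer percent between 0 and 99, each element is therefore removed with probability (99 - percent)/99 rather than exactly (100 - percent)/100. Use np.random.randint(1, 101) if you want rolls from 1 through 100.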
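
Because every element is rolled independently, the number of removed elements also varies from run to run. If you need an exact count instead, one option is to sample flat indices without replacement. This is a different technique from the one above. It is a minimal sketch assuming a 2-D array and the newer np.random.default_rng API; the name remove_fraction and its keep_percent parameter are made up for illustration:

```python
import numpy as np

def remove_fraction(data, keep_percent, rng=None):
    """Zero out (100 - keep_percent)% of the entries of a 2-D array, chosen at random.

    Modifies `data` in place and returns (rows, cols, values) of the removed entries.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Number of elements to remove, rounded to the nearest integer.
    n_remove = round(data.size * (100 - keep_percent) / 100)

    # Pick distinct positions in the flattened array, then map them back to 2-D.
    flat = rng.choice(data.size, size=n_remove, replace=False)
    rows, cols = np.unravel_index(flat, data.shape)

    values = data[rows, cols]  # fancy indexing returns a copy
    data[rows, cols] = 0
    return rows, cols, values

data = np.array([[1, 2], [3, 4]])
rows, cols, values = remove_fraction(data, keep_percent=75)
print(data, list(zip(rows.tolist(), cols.tolist(), values.tolist())))
```

With a 2x2 matrix and keep_percent=75, exactly one element is zeroed on every run; which one it is depends on the random generator.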
