np.shuffle much slower than np.random.choice

Question

I have an array of shape (N, 3) and I'd like to randomly shuffle the rows. N is on the order of 100,000.

I discovered that np.random.shuffle was bottlenecking my application. I tried replacing the shuffle with a call to np.random.choice and experienced a 10x speed-up. What's going on here? Why is it so much faster to call np.random.choice? Does the np.random.choice version generate a uniformly distributed shuffle?

import timeit

task_choice = '''
N = 100000
x = np.zeros((N, 3))
inds = np.random.choice(N, N, replace=False)
x[np.arange(N), :] = x[inds, :]
'''

task_shuffle = '''
N = 100000
x = np.zeros((N, 3))
np.random.shuffle(x)
'''

task_permute = '''
N = 100000
x = np.zeros((N, 3))
x = np.random.permutation(x)
'''

setup = 'import numpy as np'

timeit.timeit(task_choice, setup=setup, number=10)
>>> 0.11108078400138766

timeit.timeit(task_shuffle, setup=setup, number=10)
>>> 1.0411593900062144

timeit.timeit(task_permute, setup=setup, number=10)
>>> 1.1140159380011028

Edit: For anyone curious, I decided to go with the following solution since it is readable and outperformed all other methods in my benchmarks:

task_ind_permute = '''
N = 100000
x = np.zeros((N, 3))
inds = np.random.permutation(N)
x[np.arange(N), :] = x[inds, :]
'''

CT Zhu · Accepted Answer

permutation and shuffle are linked, in fact permutation calls shuffle under the hood!!

The reason why shuffle is slower than permutation for multidimensional array is that permutation only need to shuffle the index along the first axis. Thus becomes a special case of shuffle of 1d array (the 1st if-else block).

This special case is also explained in the source as well:

# We trick gcc into providing a specialized implementation for
# the most common case, yielding a ~33% performance improvement.
# Note that apparently, only one branch can ever be specialized.

For shuffle on the otherhand, multidimensional ndarray operation requires a bounce buffer, creating that buffer, especially when the dimension is relative big, becomes expensive. Additionally, we can no longer use the trick mentioned above that helps the 1d case.

With replace=False and using choice to generate a new array of the same size, choice and permutation is the same, see here. The extra time would have to come from the time spend in creating intermediate index arrays.

np.shuffle much slower than np.random.choice

Answers (2)

Related Questions