Reputation: 13
I have an array of shape (N, 3) and I'd like to randomly shuffle the rows. N is on the order of 100,000.
I discovered that np.random.shuffle was bottlenecking my application. I tried replacing the shuffle with a call to np.random.choice and experienced a 10x speed-up. What's going on here? Why is it so much faster to call np.random.choice? Does the np.random.choice version generate a uniformly distributed shuffle?
import timeit
task_choice = '''
N = 100000
x = np.zeros((N, 3))
inds = np.random.choice(N, N, replace=False)
x[np.arange(N), :] = x[inds, :]
'''
task_shuffle = '''
N = 100000
x = np.zeros((N, 3))
np.random.shuffle(x)
'''
task_permute = '''
N = 100000
x = np.zeros((N, 3))
x = np.random.permutation(x)
'''
setup = 'import numpy as np'
timeit.timeit(task_choice, setup=setup, number=10)
>>> 0.11108078400138766
timeit.timeit(task_shuffle, setup=setup, number=10)
>>> 1.0411593900062144
timeit.timeit(task_permute, setup=setup, number=10)
>>> 1.1140159380011028
Edit: For anyone curious, I decided to go with the following solution since it is readable and outperformed all other methods in my benchmarks:
task_ind_permute = '''
N = 100000
x = np.zeros((N, 3))
inds = np.random.permutation(N)
x[np.arange(N), :] = x[inds, :]
'''
Upvotes: 0
Views: 1310
Reputation: 54340
permutation
and shuffle
are linked, in fact permutation
calls shuffle
under the hood!!
The reason why shuffle
is slower than permutation
for multidimensional array is that permutation
only need to shuffle
the index along the first axis. Thus becomes a special case of shuffle
of 1d array (the 1st if-else block).
This special case is also explained in the source as well:
# We trick gcc into providing a specialized implementation for
# the most common case, yielding a ~33% performance improvement.
# Note that apparently, only one branch can ever be specialized.
For shuffle
on the otherhand, multidimensional ndarray operation requires a bounce buffer, creating that buffer, especially when the dimension is relative big, becomes expensive. Additionally, we can no longer use the trick mentioned above that helps the 1d case.
With replace=False
and using choice
to generate a new array of the same size, choice
and permutation
is the same, see here. The extra time would have to come from the time spend in creating intermediate index arrays.
Upvotes: 1
Reputation: 51155
You're comparing very different sized arrays here. In your first example, although you create an array of zeros, you simply use random.choice(100000, 100000)
, which pulls 100000 random values between 1-100000. In your second example your are shuffling a (100000, 3)
shape array.
>>> x.shape
(100000, 3)
>>> np.random.choice(N, N, replace=False).shape
(100000,)
Timings on more equivalent samples:
In [979]: %timeit np.random.choice(N, N, replace=False)
2.6 ms ± 201 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [980]: x = np.arange(100000)
In [981]: %timeit np.random.shuffle(x)
2.29 ms ± 67.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [982]: x.shape == np.random.choice(N, N, replace=False).shape
Out[982]: True
Upvotes: 2