Reputation: 4953

Replicating MATLAB's `randperm` in NumPy

I want to replicate MATLAB's randperm() with NumPy.

Currently, to get randperm(n, k) I use np.random.permutation(n)[:k]. The problem is that this allocates an array of size n and then keeps only k of its entries.

Is there a more memory-efficient way to create the array directly?

Upvotes: 3

Answers (3)

Peter O.

Reputation: 32878

numpy.random.choice(n, k, replace=False) is no more memory-efficient than numpy.random.permutation(n)[:k]. It too creates an n-item temporary array, shuffles it, and takes k items from it. See:

Comparison of np.random.choice vs np.random.shuffle for samples without replacement

However, numpy.random.* functions, such as numpy.random.choice and numpy.random.permutation, have become legacy functions as of NumPy 1.17, and their algorithms — inefficiencies and all — are expected to remain as they are for backward compatibility reasons (see the recent RNG policy for NumPy).

Fortunately, NumPy since version 1.17 has an alternative: numpy.random.Generator.choice, which uses a much more efficient implementation, as the timings below show:

In [227]: timeit np.random.choice(4000000, 48, replace=False)
163 ms ± 19.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [228]: timeit np.random.permutation(4000000)[:48]
178 ms ± 22.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [229]: r = np.random.default_rng()

In [230]: timeit r.choice(4000000, 48, replace=False)
14.5 µs ± 28.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

If you use NumPy 1.17 or later, newer applications should use the pseudorandom number generation system introduced in that version, including numpy.random.Generator.
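As a rough illustration of the approach above, here is a minimal sketch of a randperm-like helper built on Generator.choice. The function name randperm and its rng parameter are hypothetical names chosen for this example, not part of NumPy, and the seed is only there to make the run reproducible.

```python
import numpy as np

def randperm(n, k=None, rng=None):
    """Hypothetical helper: k unique integers from 0..n-1 in random order.

    With k omitted, returns a full random permutation of 0..n-1.
    """
    # Fall back to a freshly seeded Generator when none is supplied.
    rng = np.random.default_rng() if rng is None else rng
    if k is None:
        return rng.permutation(n)
    # Sampling without replacement through the Generator API.
    return rng.choice(n, size=k, replace=False)

# Seed used only so the example is reproducible.
sample = randperm(4_000_000, 48, rng=np.random.default_rng(0))
```

Passing a Generator in explicitly keeps the helper reproducible in tests while still defaulting to fresh entropy in normal use.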

Upvotes: 3

TaQ

Reputation: 162

I recommend np.random.choice(n, k, replace=False), though I am not sure about its memory efficiency. Please refer to the docs.

Upvotes: 2

Royi

Reputation: 4953

Based on @TaQ's answer:

np.random.choice(n, k, replace = False)

is the equivalent of MATLAB's randperm(n, k).
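One caveat worth keeping in mind: MATLAB's randperm(n, k) returns values from 1 to n, whereas the NumPy call returns values from 0 to n-1, which is what NumPy's 0-based indexing expects. A small sketch of the difference follows; it uses the newer Generator API rather than np.random.choice, and the values of n, k, and the seed are arbitrary choices for illustration.

```python
import numpy as np

n, k = 10, 4
rng = np.random.default_rng(0)  # seed chosen only for reproducibility

# k unique values in 0..n-1, directly usable as NumPy indices.
idx0 = rng.choice(n, size=k, replace=False)

# Shifted to 1..n, matching the value range MATLAB's randperm(n, k) produces.
idx1 = idx0 + 1
```

Use idx0 when indexing NumPy arrays, and idx1 only when the numbers themselves must match MATLAB's range.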

Update: I will update his answer as well to mark it.

Upvotes: 0
