Midnighter

Reputation: 3881

Is there a faster version of numpy.random.shuffle?

I'm using numpy.random.shuffle in order to compute a statistic on randomized columns of a 2D array. The Python code is as follows:

import numpy as np

def timeline_sample(series, num):
    random = series.copy()
    for i in range(num):
        np.random.shuffle(random.T)
        yield random

The speed I get is something like this:

import numpy as np
arr = np.random.sample((50, 5000))

%%timeit
for series in timeline_sample(arr, 100):
    np.sum(series)
1 loops, best of 3: 391 ms per loop

I tried to Cythonize this function, but I wasn't sure how to replace the call to np.random.shuffle, and the resulting function was 3x slower. Does anyone know how to accelerate or replace this? It is currently the bottleneck in my program.

Cython code:

cimport cython

import numpy as np
cimport numpy as np


@cython.boundscheck(False)
@cython.wraparound(False)
def timeline_sample2(double[:, ::1] series, int num):
    cdef double[:, ::1] random = series.copy()
    cdef int i
    for i in range(num):
        np.random.shuffle(random.T)
        yield random
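
For reference, np.random.shuffle(random.T) performs a Fisher–Yates shuffle over the columns. Below is a minimal pure-NumPy sketch of that loop (shuffle_columns_inplace is a hypothetical name, not from my program); translating its body to Cython with typed indices would be one way to drop the Python-level call:

import numpy as np

def shuffle_columns_inplace(a, rng=None):
    # Fisher-Yates over columns: walk from the last column down,
    # swapping each column with a uniformly chosen column at or before it.
    rng = np.random.default_rng() if rng is None else rng
    n_cols = a.shape[1]
    for j in range(n_cols - 1, 0, -1):
        k = rng.integers(0, j + 1)   # k in [0, j]
        a[:, [j, k]] = a[:, [k, j]]  # fancy indexing copies, so the swap is safe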

Upvotes: 4

Views: 8264

Answers (2)

Veedrac

Reputation: 60127

It's likely that this will give a nice speed boost:

from timeit import Timer

import numpy as np
arr = np.random.sample((50, 5000))

def timeline_sample(series, num):
    random = series.copy()
    for i in range(num):
        np.random.shuffle(random.T)
        yield random

def timeline_sample_fast(series, num):
    random = series.T.copy()
    for i in range(num):
        np.random.shuffle(random)
        yield random.T

def timeline_sample_faster(series, num):
    length = series.shape[1]
    for i in range(num):
        yield series[:, np.random.permutation(length)]

def consume(iterable):
    for s in iterable:
        np.sum(s)

min(Timer(lambda: consume(timeline_sample(arr, 1))).repeat(10, 10))
min(Timer(lambda: consume(timeline_sample_fast(arr, 1))).repeat(10, 10))
min(Timer(lambda: consume(timeline_sample_faster(arr, 1))).repeat(10, 10))
#>>> 0.2585161680035526
#>>> 0.2416607110062614
#>>> 0.04835709399776533

The permutation version wins because np.random.permutation builds a fresh column-index array and fancy indexing gathers all the columns in one pass, instead of repeatedly swapping strided column views. Forcing the result to be contiguous does increase the time, but not by a ton:

def consume(iterable):
    for s in iterable:
        np.sum(np.ascontiguousarray(s))

min(Timer(lambda: consume(timeline_sample(arr, 1))).repeat(10, 10))
min(Timer(lambda: consume(timeline_sample_fast(arr, 1))).repeat(10, 10))
min(Timer(lambda: consume(timeline_sample_faster(arr, 1))).repeat(10, 10))
#>>> 0.2632228760048747
#>>> 0.25778737501241267
#>>> 0.07451769898761995
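
On NumPy 1.17+, the same permutation trick works with the newer Generator API, which is usually at least as fast; a small sketch (the function name, seed parameter, and rng are mine, not from the question):

import numpy as np

def timeline_sample_generator(series, num, seed=None):
    # Same column-permutation idea, using np.random.default_rng.
    # Note: fancy indexing yields a fresh copy on every iteration.
    rng = np.random.default_rng(seed)
    length = series.shape[1]
    for i in range(num):
        yield series[:, rng.permutation(length)]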

Upvotes: 6

ryanpattison

Reputation: 6251

Randomizing rows will be cheaper; the code below is equivalent in functionality but about 3 times faster on my machine.

def timeline_sample_fast(series, num):
    random = series.T.copy()
    for i in range(num):
        np.random.shuffle(random)
        yield random.T



arr = np.random.sample((600, 50))

%%timeit                         
for s in timeline_sample(arr, 100):
    np.sum(s)

10 loops, best of 3: 55.5 ms per loop

%%timeit
for s in timeline_sample_fast(arr, 100):
    np.sum(s)

10 loops, best of 3: 18.6 ms per loop
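
The difference comes down to memory layout: np.random.shuffle swaps slices along the first axis, and for a C-contiguous array those slices are contiguous rows, while shuffling the transpose swaps strided columns. A quick illustrative check:

import numpy as np

a = np.random.sample((600, 50))
print(a.flags['C_CONTIGUOUS'])    # True: each row is one contiguous block
print(a.T.flags['C_CONTIGUOUS'])  # False: columns are strided views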

Upvotes: 1
