NumPy arrays seem very slow; am I doing something wrong?

Question

For these two functions performing the same computation, why does the second take about eight times as long to run?

import random
import numpy as np

def random_walk_2(n):
    """Return coordinates after 'n' step random walk."""
    x, y = 0, 0
    for _ in range(n):
        (dx, dy) = random.choice([(0, 1), (0, -1), (1, 0), (-1, 0)])
        x += dx
        y += dy
    return (x, y)     # 6.48 seconds for 20,000

def random_walk_3(n):
    """Return coordinates after 'n' step random walk."""
    location = np.array([0, 0])
    for _ in range(n):
        move = np.array(random.choice([[0, 1], [0, -1], [1, 0], [-1, 0]]))
        location += move
    return location     # 54.8 seconds for 20,000

TIA,

Mark

hpaulj · Accepted Answer

To make full use of numpy, generate all the moves at once. Here I use numpy's version of choice to pick 1000 moves in a single call, and then just sum them:

In [138]: arr = np.array([[0, 1], [0, -1], [1, 0], [-1, 0]])
In [139]: moves = np.random.choice(4, 1000)
In [140]: np.sum(arr[moves,:], axis=0)
Out[140]: array([  9, -51])

Another "run":

In [141]: moves = np.random.choice(4, 1000)
In [142]: np.sum(arr[moves,:], axis=0)
Out[142]: array([30, 34])

A timing for a single 20,000-step walk:

In [144]: timeit np.sum(arr[np.random.choice(4, 20000),:],0)
952 µs ± 190 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
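
For comparison with the functions in the question, the same idea can be wrapped up as a function. This is only a sketch: the name random_walk_vectorized and the STEPS constant are made up for illustration, and it uses NumPy's newer Generator API (np.random.default_rng) rather than the np.random.choice call shown above, which should behave the same way for this purpose.

```python
import numpy as np

# The four unit steps, in the same order as in the question.
STEPS = np.array([[0, 1], [0, -1], [1, 0], [-1, 0]])

def random_walk_vectorized(n, rng=None):
    """Return coordinates after 'n' step random walk."""
    if rng is None:
        # Assumption: a fresh, unseeded generator is fine when none is passed.
        rng = np.random.default_rng()
    # Draw all n step indices in one call instead of one per loop iteration.
    moves = rng.integers(0, len(STEPS), size=n)
    # Index the step table with the whole index array, then sum down the rows.
    return STEPS[moves].sum(axis=0)

# Passing a seeded generator makes the walk reproducible.
location = random_walk_vectorized(1000, rng=np.random.default_rng(0))
```

The loop over steps disappears entirely: one call draws the indices, one indexing operation builds an (n, 2) array of moves, and one sum reduces it to the final location.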
