Why is numpy ndarray much slower than lists for simple loops?

Question

I'm new to Python so I've decided to solve some common challenges to improve my knowledge of the language. I learned about numpy and its efficient ndarrays so I attempted the following experiment:

Consider the 2 sum problem (e.g. here) and let's solve it the naive way (it doesn't matter for the purpose of this question). Here's a solution with python's lists:

from  itertools import combinations

def twosum1(n_lst):
    pairs=list(combinations(n_lst,2))
    solutions=[]
    for pair in pairs:
        if sum(pair)==7: solutions.append(pair)
    return(solutions)

Then I created a version using np.arrays expecting it will drastically speed up the calculation:

from  itertools import combinations
import numpy as np

def twosum2(n_lst):
    pairs=np.array(list(combinations(n_lst,2)),dtype=int)
    return pairs[pairs[:,1]+pairs[:,0]==7]

However, after timing the two functions, twosum2 is about 2x slower than twosum1. So I thought that the problem maybe in the dynamical selection of elements, so I've written an exact copy of twosum1 by replacing lists with ndarrays ...

def twosum3(n_lst):
    pairs=np.array(list(combinations(n_lst,2)))
    solutions=np.empty((0,2))
    for pair in pairs:
        if np.sum(pair)==7: 
            solutions=np.append(solutions,[pair],axis=0)
    return(solutions)

... and the resulting function was 10x slower than the original!

How is this possible? What I'm I doing wrong here? Clearly, removing loops and replacing lists with ndarrays is not enough to gain speed (contrary to what I learned reading this).

Edit:

I use %timeit in jupyter to time the functions.
I take identical benchmarks for all the functions I'm timing.
The fact that I calculate combinations in the same way in the 3 functions tells me that the slowing down is due to numpy ... but don't see how.

B. M. · Accepted Answer

The costly operation is np.array(list(combinations(n_lst,2)),dtype=int) because python must scan each member of the list, check if member is 'int compatible', convert it in integer and store it in the array.

To reach numpy performance, you must conceive all the algorithm in numpy. For example :

In [63]: n_lst=list(range(100))

In [64]: %timeit twosum1(n_lst)
11.2 ms ± 1.64 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [65]: np.vstack(np.where(np.add.outer(n_lst,n_lst)==7)).T
Out[65]: 
array([[0, 7],
       [1, 6],
       [2, 5],
       [3, 4],
       [4, 3],
       [5, 2],
       [6, 1],
       [7, 0]], dtype=int64)

In [66]: %timeit np.vstack(np.where(np.add.outer(n_lst,n_lst)==7)).T
306 µs ± 19 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

This way you will win a 30 to 100 factor , depending of the problem.

Why is numpy ndarray much slower than lists for simple loops?

Answers (1)

Related Questions