Matt Ellis

Reputation: 162

Indexing numpy array as slow as python append

I have always been told that python's native append is a slow function and should be avoided in for loops. However, after a couple of small tests I have found that a numpy array performs no better when I iterate over it with a for loop:


First Test - array/list construction

python native list append

def pythonAppend(n):
    x = []
    for i in range(n):
        x.append(i)
    return x

%timeit pythonAppend(1000000)

Numpy allocate array then access

import numpy as np

def numpyConstruct(n):
    x = np.zeros(n)
    for i in range(n):
        x[i] = i
    return x

%timeit numpyConstruct(1000000)

Results:

Python Time: 179 ms

Numpy Time: 189 ms


Second Test - Accessing Elements

n = 1000000
x = pythonAppend(n)
arr = numpyConstruct(n)
order = arr[:]
np.random.shuffle(order)
order = list(order.astype(int))

def listAccess(x):
    for i in range(len(x)):
        x[i] = i/3.
    return x

def listAccessOrder(x, order):
    for i in order:
        x[i] = i/3.
    return x

%timeit listAccess(x)
%timeit listAccess(arr)
%timeit listAccessOrder(x, order)
%timeit listAccessOrder(arr, order)
%timeit arr / 3.

Results

python -sequential: 175 ms

numpy -sequential: 198 ms

python -shuffled access: 2.08 s

numpy -shuffled access: 2.15 s

numpy -vectorised access: 1.92 ms



These results are quite surprising to me. I would have thought that, if python lists were linked lists, accessing elements would be slower than with numpy, due to having to follow a chain of pointers. Or am I misunderstanding the python list implementation? Also, why do the python lists perform slightly better than the numpy equivalents? I guess most of the inefficiency comes from using a python for loop, but python's append still out-competes numpy at both accessing and allocating.

Upvotes: 0

Views: 1155

Answers (1)

hpaulj

Reputation: 231325

Python lists are not linked lists. The data buffer of the list object contains links/pointers to objects (elsewhere in memory), so fetching the ith element is easy. The buffer also has growth headroom, so append is a simple matter of inserting the new element's link. The buffer has to be reallocated periodically, but Python manages that seamlessly.
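You can see that growth headroom with a small sketch (behaviour shown is for CPython; variable names here are just for illustration). As we append, `sys.getsizeof` jumps in steps rather than by one pointer per element, because the list reallocates with spare capacity:

```python
import sys

lst = []
sizes = []
for _ in range(20):
    lst.append(None)
    sizes.append(sys.getsizeof(lst))

# The reported size stays flat between reallocations, so there are
# far fewer distinct sizes than appends.
print(len(set(sizes)) < len(sizes))  # True on CPython
```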

A numpy array also has a 1d data buffer, but it contains numeric values (more generally, the bytes required by the dtype). Fetching is easy, but the array has to create a new python object to 'contain' the value ('boxing'). Assignment likewise requires conversion from the python object to the bytes that will be stored.
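A quick way to see that boxing step: fetching the same index from an array twice builds a fresh scalar object each time, while a list just hands back the stored pointer:

```python
import numpy as np

arr = np.zeros(3)
lst = [0.0, 0.0, 0.0]

# numpy builds a new scalar object on every fetch from the raw bytes:
print(arr[0] is arr[0])  # False
# the list returns the same stored object every time:
print(lst[0] is lst[0])  # True
```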

Generally we've found that making a new array by appending to a list (with one np.array call at the end) is competitive with assignment to a preallocated array.
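A minimal sketch of the comparison (function names and the `timeit` harness are mine, not from the question): grow a list and convert once at the end, versus assigning into a preallocated array.

```python
import timeit
import numpy as np

def build_by_append(n):
    # Grow a python list, then convert with one np.array call at the end.
    out = []
    for i in range(n):
        out.append(i)
    return np.array(out)

def build_preallocated(n):
    # Assign element-by-element into a preallocated array.
    out = np.zeros(n)
    for i in range(n):
        out[i] = i
    return out

n = 100_000
t_append = timeit.timeit(lambda: build_by_append(n), number=5)
t_prealloc = timeit.timeit(lambda: build_preallocated(n), number=5)
print(t_append, t_prealloc)  # typically within a small factor of each other
```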

Iteration through a numpy array is normally slower than iterating through a list.

What we strongly discourage is using np.append (or some variant) to grow an array iteratively. That makes a new array each time, with a full copy.
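A short demonstration of why: np.append never modifies in place, so the discouraged growth pattern reallocates and copies the whole array on every iteration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.append(a, 4)  # allocates a fresh array and copies both inputs
print(b)  # [1 2 3 4]
print(a)  # [1 2 3]  (original is untouched)

# The discouraged pattern: quadratic copying as the array grows.
out = np.empty(0, dtype=int)
for i in range(5):
    out = np.append(out, i)  # full reallocation and copy every time
print(out)  # [0 1 2 3 4]
```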

numpy arrays are fast when the iteration is done in the compiled code. Normally that's with the whole-array methods. The C code can iterate over the data buffer, operating on the values directly, rather than invoking python object methods at each step.
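That's the explanation for the ~100x gap in the question's last timing. A sketch contrasting the two (helper names are mine): both produce the same values, but only one loops in compiled code.

```python
import numpy as np

arr = np.arange(10_000, dtype=float)

def divideLoop(a):
    # Python-level loop: one boxing/unboxing round-trip per element.
    out = np.empty_like(a)
    for i in range(len(a)):
        out[i] = a[i] / 3.
    return out

def divideVectorised(a):
    # Whole-array method: the loop runs in compiled code over the buffer.
    return a / 3.

print(np.allclose(divideLoop(arr), divideVectorised(arr)))  # True
```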

Upvotes: 2
