Dev Aggarwal

Reputation: 8516

numpy array assignment is slower than python list

numpy -

import numpy as np
arr = np.array([[1, 2, 3, 4]])
row = np.array([1, 2, 3, 4])
%timeit arr[0] = row
466 ns ± 12.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

python list -

arr = [[1, 2, 3, 4]]
row = [1, 2, 3, 4]
%timeit arr[0] = row
59.3 ns ± 2.94 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Shouldn't numpy be the faster version here?


Here's what I'm aiming to get done -

arr = np.empty((150, 4))

while True:
    row = get_row_from_api()
    arr[-1] = row

Upvotes: 2

Views: 1276

Answers (2)

Mad Physicist

Reputation: 114230

The underlying semantics of the two operations are very different. Python lists are arrays of references. Numpy arrays are arrays of the data itself.

The line row = get_row_from_api() implies that a fresh list has already been allocated.

Assigning to a list as lst[-1] = row just writes an address (a pointer to row) into lst. That's generally 4 or 8 bytes, depending on the platform, no matter how long row is.

Assigning into an array as arr[i] = row copies data. It's shorthand for arr[i, :] = row: every element of row gets copied into the buffer of arr. If row is a list, that incurs additional overhead to convert from Python objects to native numerical types.
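To make the difference concrete, here is a minimal sketch (the variable names are mine, not from the question). It mutates the source after the assignment: the list sees the change because it holds a reference, while the array does not because it holds its own copy.

```python
import numpy as np

# List assignment stores a reference to the same list object.
lst = [[0, 0, 0, 0]]
row = [1, 2, 3, 4]
lst[0] = row
row[0] = 99                           # mutate the source afterwards
list_aliases_row = lst[0] is row      # the list slot points at row itself
list_first = lst[0][0]                # so the change is visible through lst

# NumPy assignment copies the values into arr's own buffer.
arr = np.zeros((1, 4))
src = np.array([1.0, 2.0, 3.0, 4.0])
arr[0] = src
src[0] = 99.0                         # mutate the source afterwards
array_first = arr[0, 0]               # arr still holds the original value
shares = np.shares_memory(arr, src)   # the two arrays use separate buffers

print(list_aliases_row, list_first, array_first, shares)
```

So the list version is fast because it does almost no work, not because lists are a better container for numerical data.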

Remember that premature optimization is pointless. Your time savings for one method vs the other are likely to be negligible. At the same time, if you need an array later down the line anyway, it's likely faster to pre-allocate and take a small speed hit on each assignment than to call np.array on the final list. In the former case, you allocate a buffer of predetermined size and dtype up front. In the latter, you've only deferred the overhead of copying the data, and you also incur the overhead of having to figure out the array size and dtype.
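A rough sketch of the two approaches side by side. The get_row_from_api below is a hypothetical stand-in I made up for the question's API call (it takes an index only so the rows differ), and the row count of 150 is borrowed from the question.

```python
import numpy as np

def get_row_from_api(i):
    # Hypothetical stand-in for the real API call; returns a fresh list.
    return [i, i + 1, i + 2, i + 3]

n_rows = 150

# Option 1: pre-allocate once, copy each row into the buffer as it arrives.
arr = np.empty((n_rows, 4))
for i in range(n_rows):
    arr[i] = get_row_from_api(i)

# Option 2: collect references in a list, convert everything at the end.
rows = []
for i in range(n_rows):
    rows.append(get_row_from_api(i))
arr_from_list = np.array(rows, dtype=float)

# Both end up with the same data; only the timing of the copy differs.
same = np.array_equal(arr, arr_from_list)
print(arr.shape, arr_from_list.shape, same)
```

If you care which one wins for your row size and count, time both with %timeit on realistic data rather than on a single 4-element assignment.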

Upvotes: 1

Michał Słodki

Reputation: 168

Yep, using Python lists this way would definitely be faster. When you assign something to a Python list element, it's not copied; only a reference is reassigned (https://developers.google.com/edu/python/lists). NumPy instead copies all the elements from the source container into the target one. I'm not sure you need NumPy arrays here at all: creating them is not free, and Python lists aren't that slow at creation (or, as we see, at assignment).

Upvotes: 1
