Reputation: 2201
I was under the impression that numpy would be faster for list operations, but the following example seems to indicate otherwise:
import numpy as np
import time
def ver1():
    a = [i for i in range(40)]
    b = [0 for i in range(40)]
    for i in range(1000000):
        for j in range(40):
            b[j] = a[j]

def ver2():
    a = np.array([i for i in range(40)])
    b = np.array([0 for i in range(40)])
    for i in range(1000000):
        for j in range(40):
            b[j] = a[j]
t0 = time.time()
ver1()
t1 = time.time()
ver2()
t2 = time.time()
print(t1-t0)
print(t2-t1)
Output is:
4.872278928756714
9.120521068572998
(I'm running 64-bit Python 3.4.3 in Windows 7, on an i7 920)
I do understand that this isn't the fastest way to copy a list, but I'm trying to find out if I'm using numpy incorrectly. Or is it the case that numpy is slower for this kind of operation and is only more efficient in more complex operations?
EDIT:
I also tried the following, which just does a direct copy via b[:] = a, and numpy is still twice as slow:
import numpy as np
import time
def ver6():
    a = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
    b = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
    for i in range(1000000):
        b[:] = a

def ver7():
    a = np.array([0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0])
    b = np.array([0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0])
    for i in range(1000000):
        b[:] = a
t0 = time.time()
ver6()
t1 = time.time()
ver7()
t2 = time.time()
print(t1-t0)
print(t2-t1)
Output is:
0.36202096939086914
0.6750380992889404
Upvotes: 5
Views: 1435
Reputation: 67417
Most of what you are seeing is Python object creation from C native types.
A Python list is, at its heart, an array of PyObject pointers. When a and b are both Python lists, doing b[i] = a[i] implies:

- decreasing the reference count of the object currently stored in b[i],
- increasing the reference count of the object stored in a[i], and
- copying the pointer stored in a[i] into b[i].

But if a and b are NumPy arrays, things are a little more elaborate, and the same b[i] = a[i] then requires:

- creating a Python scalar object from the native value stored in a[i],
- converting that Python object back into a native value and storing it in b[i], and
- disposing of the intermediate Python object.

So the difference is mostly in creating and disposing of that intermediate Python object, which lists do not need to do.
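If you want to see that intermediate object with your own eyes, here is a small check of my own (an illustration, not part of the mechanism above): indexing a NumPy array builds a fresh scalar wrapper on every access, while indexing a list just hands back the object it already stores.

import numpy as np

lst = list(range(40))
arr = np.arange(40)

# A list access returns the object the list already holds.
print(type(lst[5]))        # <class 'int'>
print(lst[5] is lst[5])    # True: the same stored int object both times

# An array access builds a new scalar wrapper around the native value.
print(type(arr[5]))        # e.g. <class 'numpy.int64'>
print(arr[5] is arr[5])    # False: two distinct wrapper objects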
Upvotes: 1
Reputation: 280182
You're using NumPy wrong. NumPy's efficiency relies on doing as much work as possible in C-level loops instead of interpreted code. When you do
for j in range(40):
    b[j] = a[j]
That's an interpreted loop, with all the intrinsic interpreter overhead and more, because NumPy's indexing logic is way more complex than list indexing, and NumPy needs to create a new element wrapper object on every element retrieval. You're not getting any of the benefits of NumPy when you write code like this.
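If you want to isolate that per-access cost, here is a rough timeit sketch of my own (timings will vary by machine and NumPy version): a single element read from an array pays for NumPy's indexing machinery plus a new scalar wrapper object, while the list read is a plain pointer lookup.

import timeit

# Time one element read, repeated by timeit's default number of loops.
print(timeit.timeit("a[20]", setup="a = list(range(40))"))
print(timeit.timeit("a[20]", setup="import numpy as np; a = np.arange(40)"))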
You need to write the code in such a way that the work happens in C:
b[:] = a
This would also improve the efficiency of the list operation, but it's much more important for NumPy.
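To get a sense of where the slice copy pays off, here is a hedged timeit sketch (my own, not from the posts above; numbers depend on your machine): with only 40 elements NumPy's fixed per-call overhead still dominates, but as the containers grow the C-level block copy should overtake the list copy, which has to update a reference count for every element.

import timeit

# Compare the whole-container copy at two sizes.
for n in (40, 100000):
    list_time = timeit.timeit(
        "b[:] = a",
        setup="a = list(range({0})); b = [0] * {0}".format(n),
        number=1000)
    numpy_time = timeit.timeit(
        "b[:] = a",
        setup="import numpy as np; a = np.arange({0}); b = np.zeros({0}, dtype=a.dtype)".format(n),
        number=1000)
    print(n, list_time, numpy_time)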
Upvotes: 6