Time optimization: performance of accessing values in a list of lists vs. a list of NumPy arrays

I have been trying to optimize my code.

I compared 4 possible coding choices for getting the value in one cell of a list of lists (or of a list of arrays, replacing the inner lists with NumPy arrays).

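import time
import numpy as np

M = 1000
# M x M list of lists filled with 0
my_list = [[] for i in range(M)]
for i in range(M):
    for j in range(M):
        my_list[i].append(0)
# list of M NumPy arrays, each of length M and filled with 1
my_numpy_list = [ np.full(M,1) for i in range(M) ]

# Case 1: double indexing into the list of lists
time1 = time.time()
for j in range(1000):
    for i in range(10000):
        my_list[0][0]
print( "1  ", time.time() - time1)

# Case 2: store the inner list first, then index it
time1 = time.time()
for j in range(1000):
    test_list = my_list[0]
    for i in range(10000):
        test_list[0]
print("2 ",time.time() - time1)

# Case 3: double indexing into the list of arrays
# (time1 is not reset here, so the printed value also includes case 2)
for j in range(1000):
    for i in range(10000):
        my_numpy_list[0][0]
print("3 ", time.time() - time1)


# Case 4: store the inner array first, then index it
# (time1 is still not reset, so the printed value also includes cases 2 and 3)
for j in range(1000):
    my_numpy_test_list = my_numpy_list[0]
    for i in range(10000):
        my_numpy_test_list[0]
print( "4  ", time.time() - time1)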

On my computer, it gives the following times:

1   0.9008669853210449
2  0.7616724967956543
3  2.9174351692199707
4   4.883266925811768

The question is: why does it take longer to access values in a NumPy array? If it is slower, what about converting the array into a list in order to access the data faster? In particular, I am very surprised that storing the array that was in the list in a variable (case 4) is the slowest case. Shouldn't the times be ordered as:

4 < 2 < 3 < 1 ?

Cheers

Upvotes: 0

Views: 264

Answers (1)

riccardo nizzolo

Reputation: 621

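Because the goal of NumPy is not to make access to individual elements faster. Indexing a single array element from Python has to create a new NumPy scalar object on every access, whereas indexing a list just returns a reference to an object that already exists. The goal of NumPy is instead to let you write vectorized code and avoid Python loops.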
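The difference in per-element access can be checked directly. The following is only a minimal sketch: it times each of the four access patterns from the question in isolation with `timeit`, using lambdas so that every case pays the same call overhead. The repeat count and the names `row_list`, `row_array` and `timings` are illustrative choices, not part of the question. It also shows `tolist()`, which converts an array to plain Python objects once if element-by-element access really is needed.

```python
import timeit
import numpy as np

M = 1000
my_list = [[0] * M for _ in range(M)]
my_numpy_list = [np.full(M, 1) for _ in range(M)]

# A list hands back the Python int it already stores;
# an array wraps the raw value in a new NumPy scalar on each access.
list_item = my_list[0][0]
array_item = my_numpy_list[0][0]
print(type(list_item))    # built-in int
print(type(array_item))   # a NumPy integer type (exact width is platform dependent)

row_list = my_list[0]
row_array = my_numpy_list[0]
number = 200_000          # accesses per case (illustrative value)

# Each case is timed on its own, so no measurement includes another one.
timings = {
    "1 list of lists, double index": timeit.timeit(lambda: my_list[0][0], number=number),
    "2 list, row stored first": timeit.timeit(lambda: row_list[0], number=number),
    "3 list of arrays, double index": timeit.timeit(lambda: my_numpy_list[0][0], number=number),
    "4 array, row stored first": timeit.timeit(lambda: row_array[0], number=number),
}
for label, seconds in timings.items():
    print(label, seconds)

# One-off conversion to plain Python ints for code that must index element by element.
converted = row_array.tolist()
print(type(converted[0]))  # built-in int
```

Whether the one-off conversion pays off depends on how many individual accesses follow it, so it is worth measuring on the real workload.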

Let's modify your example so that it adds 1 to the elements of your list/np.array:

import time
import numpy as np

M = 1000
my_list = [[] for i in range(M)]
for i in range(M):
    for j in range(M):
        my_list[i].append(0)
# a single 2-D array of shape (M, M) instead of a list of 1-D arrays
my_numpy_array = np.array([ np.full(M,1) for i in range(M) ])

# list case: Python-level loop, one scalar addition per iteration
time1 = time.time()
for j in range(1000):
    test_list = my_list[0]
    for i in range(10000):
        test_list[0]+1
print("list case addition",time.time() - time1)

# numpy case: one vectorized addition over the whole array
time2 = time.time()
my_numpy_list = my_numpy_array+1
print("numpy case addition",time.time() - time2)

The output is:

list case addition 0.7961978912353516
numpy case addition 0.0031096935272216797

The NumPy version is about 250 times faster.
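Note that the two timings above do not cover identical workloads: the Python loop performs 10,000,000 scalar additions on a single element, while `my_numpy_array+1` adds 1 to each of the 1,000,000 elements once. A like-for-like sketch, assuming the intent is to add 1 to every element of an M x M structure, could look like this. The function names and the repeat count are illustrative, and no timings are quoted because they vary by machine.

```python
import timeit
import numpy as np

M = 1000
my_list = [[0] * M for _ in range(M)]
my_numpy_array = np.zeros((M, M), dtype=np.int64)

def add_one_list(rows):
    """Add 1 to every element with a pure-Python nested comprehension."""
    return [[value + 1 for value in row] for row in rows]

def add_one_numpy(array):
    """Add 1 to every element with a single vectorized operation."""
    return array + 1

# Both approaches must produce the same values.
assert add_one_numpy(my_numpy_array).tolist() == add_one_list(my_list)

repeats = 10  # illustrative repeat count
list_time = timeit.timeit(lambda: add_one_list(my_list), number=repeats)
numpy_time = timeit.timeit(lambda: add_one_numpy(my_numpy_array), number=repeats)
print("list case addition ", list_time / repeats)
print("numpy case addition", numpy_time / repeats)
```

Here both functions touch each of the M * M elements exactly once, so the ratio of the two printed times reflects the cost of the Python loop against the vectorized operation.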

Upvotes: 1
