Reputation: 1551
The following code is a measured hot spot, distilled from some code I'm writing. I'm trying to figure out how to speed up this loop in Python 3.9.0; I measure the same loop to be >30x faster using std::vector in VC++ 2019.
As you can see, I've tried a few different methods. The map() function returns an iterator, so I converted its result to a list to measure the full cost of execution.
I feel this is a fairly natural way to represent my data. I could certainly work on some representational or algorithmic improvements here. However, I'm surprised that iteration is this slow, and I'd like to see if the iteration itself can be improved first.
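(For reference, the laziness of map() can be confirmed directly; this is a minimal standalone check, separate from the benchmark below:)

```python
# map() returns a lazy iterator: no work happens until it is consumed,
# which is why the benchmark wraps it in list() to pay the full cost.
m = map(lambda x: x - 1, range(5))
print(type(m).__name__)  # map
print(list(m))           # [-1, 0, 1, 2, 3]
print(list(m))           # [] -- the iterator is now exhausted
```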
Performance results from executing: python listIteration.py
Iteration by index
66.66 ms
60.90 ms
62.74 ms
Total: 124998250000
Iteration by index -- just integers
55.22 ms
55.27 ms
80.84 ms
Total: 124998250000
Iteration by object
56.48 ms
60.30 ms
55.77 ms
Total: 124998250000
List comprehension
235.34 ms
328.15 ms
272.47 ms
Total: 124998250000
Map
310.81 ms
353.87 ms
300.27 ms
Total: 124998250000
Code:
import time


def makeList():
    data = []
    for i in range(500000):
        data.append([i, i, i])
    return data


def makeListOfInts():
    data = []
    for i in range(500000):
        data.append(i)
    return data


def dumpTime(delta):
    print("{:.2f} ms".format(1000.0 * delta))


NUM_TRIALS = 3

print("Iteration by index")
data = makeList()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    for j in range(len(data)):
        data[j][0] -= 1
    x2 = time.perf_counter()
    dumpTime(x2 - x1)
total = sum([x[0] for x in data])
print("Total: " + str(total))

print("Iteration by index -- just integers")
data = makeListOfInts()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    for j in range(len(data)):
        data[j] -= 1
    x2 = time.perf_counter()
    dumpTime(x2 - x1)
total = sum(data)
print("Total: " + str(total))

print("Iteration by object")
data = makeList()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    for v in data:
        v[0] -= 1
    x2 = time.perf_counter()
    dumpTime(x2 - x1)
total = sum([x[0] for x in data])
print("Total: " + str(total))

print("List comprehension")
data = makeList()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    data = [[x[0] - 1, x[1], x[2]] for x in data]
    x2 = time.perf_counter()
    dumpTime(x2 - x1)
total = sum([x[0] for x in data])
print("Total: " + str(total))

print("Map")
data = makeList()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    # here we convert the map object to a list, because map() returns
    # an iterator, and we want to measure the full cost of the computation
    data = list(map(lambda x: [x[0] - 1, x[1], x[2]], data))
    x2 = time.perf_counter()
    dumpTime(x2 - x1)
total = sum([x[0] for x in data])
print("Total: " + str(total))
Upvotes: 2
Views: 894
Reputation: 25489
Python code is going to be slower than C++. There is no way around it, unless you eliminate or outsource the iteration to a C backend, which is what numpy does.
For example, you could do
import numpy as np


def makeArray():
    data = np.vstack((np.arange(500000), np.arange(500000), np.arange(500000))).T
    return data


def makeArrayOfInts():
    data = np.arange(500000)
    return data
And then, you wouldn't need to iterate at all.
data = makeArray()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    data[:, 0] = data[:, 0] - 1
    x2 = time.perf_counter()
    dumpTime(x2 - x1)
total = sum(data[:, 0])
print("Total: " + str(total))

data = makeArrayOfInts()
for t in range(NUM_TRIALS):
    x1 = time.perf_counter()
    data = data - 1
    x2 = time.perf_counter()
    dumpTime(x2 - x1)
total = sum(data)
print("Total: " + str(total))
Both of these are very fast: each trial takes ~1 ms, as opposed to the ~50 ms needed to iterate over the lists.
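As a side note (not benchmarked here), NumPy also supports in-place updates, which avoid allocating a new array on every trial, and the array's own .sum() method stays in C rather than looping in Python like the builtin sum() used above:

```python
import numpy as np

data = np.arange(500000)
data -= 1                # in-place: no new array is allocated
total = int(data.sum())  # vectorized reduction, faster than builtin sum()
print("Total: " + str(total))  # 124999250000
```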
Upvotes: 2