Shashank Trivedi
Shashank Trivedi

Reputation: 65

How to reduce execution time while evaluating each item in a list in python

I have to build a distance matrix for locations

addresses = [(10.0, 20.0), (21.2318, 72.903), (26.4499, 80.3319), (20.0, 20.0), (19.114, 72.8927), (20.4189, 77.0153), (28.5377, 77.1217), (28.6201, 77.118), (28.5257, 77.2781)]

where distance in between two locations, is computed using the following function

def calculate_travel_distance(pointA,pointB):  
  return geodesic(pointA,pointB).meters

Distance Matrix is being build up using for loop as below

dist_matrix=[]
for pt1 in addresses:
  row_list =  [round(calculate_travel_distance(pt1,pt2),0) for pt2 in addresses]  
  dist_matrix.append(row_list)

When size of addresses grow up, it takes long time to execute i.e. 300 locations ( lat/lng pair) it takes 150 seconds to execute. Is it possible to bring down execution time to few secs(may be 10) for close to 400 locations. Please suggest.

Upvotes: 0

Views: 147

Answers (1)

Tomalak
Tomalak

Reputation: 338326

Assuming you have a list of points A through J, then the matrix pairing all of those up looks like this:

AA AB AC AD AE AF AG AH AI AJ
BA BB BC BD BE BF BG BH BI BJ
CA CB CC CD CE CF CG CH CI CJ
DA DB DC DD DE DF DG DH DI DJ
EA EB EC ED EE EF EG EH EI EJ
FA FB FC FD FE FF FG FH FI FJ
GA GB GC GD GE GF GG GH GI GJ
HA HB HC HD HE HF HG HH HI HJ
IA IB IC ID IE IF IG IH II IJ
JA JB JC JD JE JF JG JH JI JJ

That's what your loop currently calculates. However, the distance AB and BA is equal, and the distances on the center line (AA, BB, ...) are always zero.

We can cut out workload in half (even less than half, from n^2 to n^2 / 2 - n) by calculating only the distances between points where x < y in the matrix.

   AB AC AD AE AF AG AH AI AJ
      BC BD BE BF BG BH BI BJ
         CD CE CF CG CH CI CJ
            DE DF DG DH DI DJ
               EF EG EH EI EJ
                  FG FH FI FJ
                     GH GI GJ
                        HI HJ
                           IJ

The blanks can be filled easily by mirroring the upper triangle. Staying within the example, this:

addresses = ['A','B','C','D','E','F','G','H','I','J']

distances = []

for x, a in enumerate(addresses):
    row = []
    distances.append(row)
    for y, b in enumerate(addresses):
        if x < y:
            row.append(a + b)               # actually calculate something
        elif x == y:
            row.append('--')                # that's always 0
        else:
            row.append(distances[y][x])     # we already calculated that

for row in distances:
    print(' '.join(row))

gives us this:

-- AB AC AD AE AF AG AH AI AJ
AB -- BC BD BE BF BG BH BI BJ
AC BC -- CD CE CF CG CH CI CJ
AD BD CD -- DE DF DG DH DI DJ
AE BE CE DE -- EF EG EH EI EJ
AF BF CF DF EF -- FG FH FI FJ
AG BG CG DG EG FG -- GH GI GJ
AH BH CH DH EH FH GH -- HI HJ
AI BI CI DI EI FI GI HI -- IJ
AJ BJ CJ DJ EJ FJ GJ HJ IJ --

The next step up in speed could be multi-threading, but maybe this optimization already is good enough for your use case.


A multi-threaded implementation of the above could look like this (it's probably not pythonic in more ways than one, but it does the job):

from multiprocessing import cpu_count
from multiprocessing.dummy import Pool as ThreadPool

# credit https://stackoverflow.com/a/54802737
def chunks(l, n):
    """Yield n number of striped chunks from l."""
    for i in range(0, n):
        yield l[i::n]

def calculate_travel_distance(a, b):
    return a + b

def calculate_distance_matrix(addresses):
    # prepare distance matrix, list of lists with n^2 slots
    distance_matrix = [['--' for a in addresses] for b in addresses]

    # the workload is the upper matrix triangle (where x < y)
    # since we're multi-threading, also remember the x,y position
    workload = [((a, b),(x, y)) for y, b in enumerate(addresses) for x, a in enumerate(addresses) if x < y]

    # worker function 
    def worker(chunk):
        return [(calculate_travel_distance(*points), matrix_pos) for points, matrix_pos in chunk]

    # distribute workload over available CPU cores
    pool = ThreadPool(cpu_count())
    result_chunks = pool.map(worker, chunks(workload, cpu_count()))

    # distribute result chunks into their slots
    for result_chunk in result_chunks:
        for result, matrix_pos in result_chunk:
            x, y = matrix_pos
            distance_matrix[x][y] = result
            distance_matrix[y][x] = result

    return distance_matrix

addresses = ['A','B','C','D','E','F','G','H','I','J']
distaince_matrix = calculate_distance_matrix(addresses)

for row in distaince_matrix:
    print(' '.join(row))

It prints the same thing:

-- AB AC AD AE AF AG AH AI AJ
AB -- BC BD BE BF BG BH BI BJ
AC BC -- CD CE CF CG CH CI CJ
AD BD CD -- DE DF DG DH DI DJ
AE BE CE DE -- EF EG EH EI EJ
AF BF CF DF EF -- FG FH FI FJ
AG BG CG DG EG FG -- GH GI GJ
AH BH CH DH EH FH GH -- HI HJ
AI BI CI DI EI FI GI HI -- IJ
AJ BJ CJ DJ EJ FJ GJ HJ IJ --

Upvotes: 1

Related Questions