Improving the execution time of matrix calculations in Python

Question

I work with a large amount of data and the execution time of this piece of code is very very important. The results in each iteration are interdependent, so it's hard to make it in parallel. It would be awesome if there is a faster way to implement some parts of this code, like:

finding the max element in the matrix and its indices
changing the values in a row/column with the max from another row/column
removing a specific row and column

Filling the weights matrix is pretty fast.

The code does the following:

it contains a list of lists of words word_list, with count elements in it. At the beginning each word is a separate list.
it contains a two dimensional list (count x count) of float values weights (lower triangular matrix, the values for which i>=j are zeros)
in each iteration it does the following:
- it finds the two words with the most similar value (the max element in the matrix and its indices)
- it merges their row and column, saving the larger value from the two in each cell
- it merges the corresponding word lists in word_list. It saves both lists in the one with the smaller index (max_j) and it removes the one with the larger index (max_i).
it stops if the largest value is less then a given THRESHOLD

I might think of a different algorithm to do this task, but I have no ideas for now and it would be great if there is at least a small performance improvement.

I tried using NumPy but it performed worse.

weights = fill_matrix(count, N, word_list)
while 1:
    # find the max element in the matrix and its indices 
    max_element = 0
    for i in range(count):
        max_e = max(weights[i])
        if max_e > max_element:
            max_element = max_e
            max_i = i
            max_j = weights[i].index(max_e)

    if max_element < THRESHOLD:
        break

    # reset the value of the max element
    weights[max_i][max_j] = 0

    # here it is important that always max_j is less than max i (since it's a lower triangular matrix)
    for j in range(count):
        weights[max_j][j] = max(weights[max_i][j], weights[max_j][j])

    for i in range(count):
        weights[i][max_j] = max(weights[i][max_j], weights[i][max_i])

    # compare the symmetrical elements, set the ones above to 0
    for i in range(count):
        for j in range(count):
            if i <= j:
                if weights[i][j] > weights[j][i]:
                    weights[j][i] = weights[i][j]
                weights[i][j] = 0

    # remove the max_i-th column
    for i in range(len(weights)):
        weights[i].pop(max_i)

    # remove the max_j-th row
    weights.pop(max_i)

    new_list = word_list[max_j]
    new_list += word_list[max_i]
    word_list[max_j] = new_list

    # remove the element that was recently merged into a cluster
    word_list.pop(max_i)
    count -= 1

David Kelley · Accepted Answer

It depends on how much work you want to put into it but if you're really concerned about speed you should look into Cython. The quick start tutorial gives a few examples ranging from a 35% speedup to an amazing 150x speedup (with some added effort on your part).

Improving the execution time of matrix calculations in Python

Answers (2)

Related Questions