neither-nor

Reputation: 1263

Optimized computation of pairwise correlations in Python

Given a set of discrete locations (e.g. "sites") that are pairwise related in some categorical way (e.g. general proximity) and that each carry local-level data (e.g. population size), I want to efficiently compute the mean correlation coefficient between the local-level data of all location pairs that share the same relationship category.

For example, I assumed 100 sites and randomized their pairwise relations using values 1 to 25, yielding the upper-triangular matrix relations:

import numpy as np

sites = 100
categ = 25

relations = np.random.randint(low=1, high=categ+1, size=(sites, sites))
relations = np.triu(relations) # keep the upper triangle only, since relation_ij = relation_ji
np.fill_diagonal(relations, 0) # ignore self-relation

I also have 5000 replicates of simulation results for each site:

sims = 5000
res = np.round(np.random.rand(sites, sims),1) # zero values will later be treated as missing and excluded

To compute the mean pairwise correlation for each relation category, I first calculated, for each category i, the correlation coefficient rho[j] between the simulation results res of each unique site pair j, and then took the average across all pairs with relation i:

rho_list = np.ones(categ)*99 # placeholder values; will hold the mean rho per category

for i in range(1, categ+1):
    idr = np.transpose(np.where(relations == i)) # pairwise site indices of the same relation category
    comp = np.vstack([res[idr[:,0]].ravel(), res[idr[:,1]].ravel()]) # pairwise comparisons of simulation results from the same relation category
    comp_uniq = np.reshape(comp.T, (len(idr), res.shape[1], -1)) # reshape above into pairwise comparisons of simulation results between unique site pairs

    rho = np.ones(len(idr))*99 # correlation coefficients of all unique site pairs of current relation category

    for j in range(len(idr)): # loop through unique site pairs
        comp_uniq_s = comp_uniq[j][np.all(comp_uniq!=0, axis=2)[j]].T # shorten comparisons by removing pairs with zero-valued result
        rho[j] = np.corrcoef(comp_uniq_s[0], comp_uniq_s[1])[0,1]

    rho_list[i-1] = np.nanmean(rho) # average across pairs, ignoring undefined correlations

Although this script works, once I increase sites = 400 the entire computation takes more than 6 hours to finish, which leads me to question my use of array functions. What is the reason for this poor performance, and how can I optimize the algorithm?

Upvotes: 4

Views: 979

Answers (1)

Divakar

Reputation: 221504

We can vectorize the innermost loop (the j iterator) with some masking to take care of the ragged nature of the data processed at each iteration of that loop. We can also swap out the slowish np.corrcoef (inspired by this post). Additionally, we can optimize a few steps at the start of the outer loop, especially the stacking steps, which could be the bottlenecks.
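
The key to dropping np.corrcoef is that Pearson's r can be computed directly from sums of products of the centered series, which vectorizes cleanly with einsum. Here is a minimal sketch for a single pair of series (illustrative names, using the question's rule of excluding zero-valued results):

import numpy as np

# one pair of simulated series, rounded like the question's res array
a = np.round(np.random.rand(5000), 1)
b = np.round(np.random.rand(5000), 1)

# keep only positions where both results are nonzero
valid = (a != 0) & (b != 0)
av, bv = a[valid], b[valid]

# Pearson's r from its definition: sum of products of the centered series,
# normalized by the square roots of the sums of squares
a_c = av - av.mean()
b_c = bv - bv.mean()
r = a_c.dot(b_c) / np.sqrt(a_c.dot(a_c) * b_c.dot(b_c))

print(np.isclose(r, np.corrcoef(av, bv)[0, 1]))  # True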

Thus, the complete code would reduce to something like this -

rho_list = np.ones(categ)*99 # as in the question

for i in range(1, categ+1):
    r,c = np.where(relations==i) # row/col indices of all site pairs in this category

    A = res[r] # simulation results of the first site of each pair
    B = res[c] # simulation results of the second site of each pair

    mask0 = ~((A!=0) & (B!=0)) # positions where either result is zero
    A[mask0] = 0
    B[mask0] = 0

    count = mask0.shape[-1] - mask0.sum(-1,keepdims=1) # valid positions per pair
    A_mA = A - A.sum(-1, keepdims=1)/count # center using means over valid positions only
    B_mB = B - B.sum(-1, keepdims=1)/count

    A_mA[mask0] = 0 # re-zero masked positions so they drop out of the sums below
    B_mB[mask0] = 0

    ssA = np.einsum('ij,ij->i',A_mA, A_mA) # per-pair sums of squares
    ssB = np.einsum('ij,ij->i',B_mB, B_mB)
    rho = np.einsum('ij,ij->i',A_mA, B_mB)/np.sqrt(ssA*ssB) # Pearson's r for every pair at once

    rho_list[i-1] = np.nanmean(rho)
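
The timings below call loopy_app() and vectorized_app(). Those wrappers are not shown in the post; a plausible reconstruction simply packages the question's double loop and the vectorized loop above, assuming relations, res and categ from the question are in scope:

def loopy_app():
    # the question's original double loop
    rho_list = np.ones(categ)*99
    for i in range(1, categ+1):
        idr = np.transpose(np.where(relations == i))
        comp = np.vstack([res[idr[:,0]].ravel(), res[idr[:,1]].ravel()])
        comp_uniq = np.reshape(comp.T, (len(idr), res.shape[1], -1))
        rho = np.ones(len(idr))*99
        for j in range(len(idr)):
            comp_uniq_s = comp_uniq[j][np.all(comp_uniq!=0, axis=2)[j]].T
            rho[j] = np.corrcoef(comp_uniq_s[0], comp_uniq_s[1])[0,1]
        rho_list[i-1] = np.nanmean(rho)
    return rho_list

def vectorized_app():
    # the vectorized loop from this answer
    rho_list = np.ones(categ)*99
    for i in range(1, categ+1):
        r,c = np.where(relations==i)
        A, B = res[r], res[c]
        mask0 = ~((A!=0) & (B!=0))
        A[mask0] = 0
        B[mask0] = 0
        count = mask0.shape[-1] - mask0.sum(-1,keepdims=1)
        A_mA = A - A.sum(-1, keepdims=1)/count
        B_mB = B - B.sum(-1, keepdims=1)/count
        A_mA[mask0] = 0
        B_mB[mask0] = 0
        ssA = np.einsum('ij,ij->i',A_mA, A_mA)
        ssB = np.einsum('ij,ij->i',B_mB, B_mB)
        rho = np.einsum('ij,ij->i',A_mA, B_mB)/np.sqrt(ssA*ssB)
        rho_list[i-1] = np.nanmean(rho)
    return rho_list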

Runtime tests

Case # 1: On given sample data, with sites = 100

In [381]: %timeit loopy_app()
1 loop, best of 3: 7.45 s per loop

In [382]: %timeit vectorized_app()
1 loop, best of 3: 479 ms per loop

15x+ speedup.

Case # 2: With sites = 200

In [387]: %timeit loopy_app()
1 loop, best of 3: 1min 56s per loop

In [388]: %timeit vectorized_app()
1 loop, best of 3: 1.86 s per loop

In [390]: 116/1.86
Out[390]: 62.36559139784946

62x+ speedup.

Case # 3: Finally with sites = 400

In [392]: %timeit vectorized_app()
1 loop, best of 3: 7.64 s per loop

This took 6hrs+ at OP's end with their loopy method.

From the timings, it became clear that vectorizing the inner loop was the key to getting noticeable speedups for large sites.
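
As a sanity check, the two reconstructed wrappers above should agree on the same inputs; nan entries can occur where a pair has no usable data, hence equal_nan:

# matching nan entries are treated as equal
print(np.allclose(loopy_app(), vectorized_app(), equal_nan=True))  # expect True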

Upvotes: 6
