Correlation table

Question

Suppose you have hundreds of numpy arrays and want to calculate the correlation between every pair of them. I did this with nested for loops, but execution took a very long time (20 minutes!). One way to make the calculation more efficient is to compute only the half of the correlation table on one side of the diagonal, copy it to the other half, and set the diagonal to 1. This works because correlation(x,y) = correlation(y,x), and correlation(x,x) is always equal to 1. However, even with these changes the code still takes a long time (approximately 7-8 minutes). Any other suggestions?

My code
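import numpy as np
# data_set: a collection of hundreds of numpy arrays
for x in data_set:
    for y in data_set:
        # np.corrcoef(x, y) returns a 2x2 matrix; [1][0] is the x-y coefficient
        correlation = np.corrcoef(x, y)[1][0]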
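
The half-table idea described in the question could be sketched roughly as follows. The function name `half_table_correlation` and the random stand-in data are hypothetical, chosen only for illustration, and the sketch assumes all arrays are 1-D and of equal length.

```python
import numpy as np

def half_table_correlation(arrays):
    """Fill a correlation table by computing only the pairs above the diagonal."""
    n = len(arrays)
    table = np.eye(n)  # correlation(x, x) == 1 on the diagonal
    for i in range(n):
        for j in range(i + 1, n):
            r = np.corrcoef(arrays[i], arrays[j])[0, 1]
            table[i, j] = r
            table[j, i] = r  # correlation(x, y) == correlation(y, x)
    return table

# Hypothetical stand-in data: 5 arrays of 50 samples each
rng = np.random.default_rng(0)
data_set = [rng.normal(size=50) for _ in range(5)]
table = half_table_correlation(data_set)
```

This roughly halves the number of `np.corrcoef` calls, but each call still builds a separate 2x2 matrix in a Python-level loop, which is likely why the run time stays in the range of minutes.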


Answers (1)
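
One possible approach, assuming all the arrays are 1-D and have the same length, is to stack them into a single 2-D array and call `np.corrcoef` once. With its default `rowvar=True`, `np.corrcoef` treats each row as one variable and returns the full correlation table in a single vectorized call. The random data below is a hypothetical stand-in for the real `data_set`.

```python
import numpy as np

# Hypothetical stand-in data: 200 arrays of 1000 samples each
rng = np.random.default_rng(0)
data_set = [rng.normal(size=1000) for _ in range(200)]

# Stack into shape (200, 1000): one row per array (assumes equal lengths)
stacked = np.vstack(data_set)

# One call computes every pairwise coefficient; table[i, j] is the
# correlation between data_set[i] and data_set[j]
table = np.corrcoef(stacked)
```

The result is symmetric with ones on the diagonal, so no separate copying step is needed. If the arrays differ in length, this sketch does not apply as written.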
