Reputation: 67
I need the Pearson correlation coefficient between two matrices X and Y.
If I run corr = numpy.corrcoef(X, Y), my output is a matrix of correlation coefficients.
However, I need a single value to represent the correlation between the two matrices.
I just saw in kennytm's answer that to get one value I should write numpy.corrcoef(X, Y)[1,0].
This solution works, but I don't understand what the numbers inside the square brackets mean, or why adding them gives a single value.
I'm interpreting 1 and 0 as limits of the coefficient, but what happens to all the coefficients inside the matrix?
What type of operation is computed on them to obtain a single value?
If I change the numbers inside the square brackets, for example to [1,-1] (correlation, anticorrelation), the value of corr changes, so I'm confused about which numbers I should use inside the brackets.
Upvotes: 4
Views: 6167
Reputation: 4137
numpy.corrcoef returns a matrix containing the correlation coefficient for every pair of rows. The rows of the second argument are stacked below the rows of the first, so for example numpy.corrcoef(A,B) for A.shape=(3,3) and B.shape=(3,3) will return a (6,6) matrix, since there are 36 row combinations. Rows 0 to 2 of that result refer to A and rows 3 to 5 refer to B. Note that it is a symmetric matrix, since it returns both correlations for (e.g.) A[1],B[1] (index [1,4]) and B[1],A[1] (index [4,1]).
The numbers in the square brackets are therefore ordinary array indices that pick one element out of the matrix; no further operation is applied to the other coefficients. When you have two 1-D arrays, you get a (2,2) matrix: the correlation of the first array with itself [0,0], the correlation of the first array with the second array [0,1], the correlation of the second array with the first array [1,0], and the correlation of the second array with itself [1,1].
import numpy as np
A = np.random.randint(low=0, high=99, size=(3,3))
B = np.random.randint(low=0, high=99, size=(3,3))
C = np.corrcoef(A,B)
# Compare with a tolerance: the two calls may differ by floating-point rounding
print(np.isclose(C[1,4], np.corrcoef(A[1],B[1])[0,1])) # True
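The 1-D case described above can be checked with a small sketch. The arrays x and y are made-up illustration values, not taken from the question. It also shows why [1,-1] gives a different number: -1 is just Python's index for the last column, so [1,-1] selects the same element as [1,1], the correlation of the second array with itself.

```python
import numpy as np

# Hypothetical 1-D inputs, chosen only for illustration
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

R = np.corrcoef(x, y)
print(R.shape)        # shape of the correlation matrix for two 1-D arrays

r_xy = R[0, 1]        # correlation of x with y
r_yx = R[1, 0]        # correlation of y with x (same value, matrix is symmetric)
r_yy = R[1, -1]       # -1 means "last column", so this is R[1, 1]: y with itself

print(np.isclose(r_xy, r_yx))
print(np.isclose(r_yy, 1.0))
```

So [0,1] or [1,0] is the element to use for the correlation between the two arrays; the diagonal elements are always 1 for non-constant inputs.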
If you want the 2-D correlation (like the correlation between images), flatten the 2-D arrays so that you obtain a single row for each array. Then the element [0,1] or [1,0] of that correlation matrix tells you how the two 2-D arrays correlate with each other as a whole.
print(np.corrcoef(A.flatten(), B.flatten())[0,1])
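As a sanity check on what that single number means, here is a minimal sketch that computes the Pearson coefficient of the flattened arrays by hand and compares it with the [0,1] element. The seeded generator and the variable names (rng, a_c, b_c, r_manual) are assumptions added for reproducibility and illustration.

```python
import numpy as np

# Seeded generator so the example is reproducible (an assumption, not in the original)
rng = np.random.default_rng(0)
A = rng.integers(0, 99, size=(3, 3))
B = rng.integers(0, 99, size=(3, 3))

# Flatten each matrix into a single 1-D vector of 9 values
a = A.ravel().astype(float)
b = B.ravel().astype(float)

# Pearson r: covariance of the centered vectors divided by the product of their norms
a_c = a - a.mean()
b_c = b - b.mean()
r_manual = (a_c @ b_c) / np.sqrt((a_c @ a_c) * (b_c @ b_c))

# The same value is the off-diagonal element of the 2x2 matrix from corrcoef
r_numpy = np.corrcoef(a, b)[0, 1]

print(np.isclose(r_manual, r_numpy))
```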
Upvotes: 8