Reputation: 325
I have two matrices, T1 and T2, each of size m x n. I want to find the correlation coefficient between the two matrices.
So far I haven't used any built-in library function for it. I am doing the following steps for it:
First I calculate the mean of the two matrices as:
M1 = T1.mean()
M2 = T2.mean()
and then I subtract the mean from the corresponding matrices as:
A = np.subtract(T1, M1)
B = np.subtract(T2, M2)
where np is the numpy library and A and B are the resulting matrices after doing the subtraction.
Now, I calculate the correlation coefficient as:
alpha = np.sum(A*B) / (np.sqrt((np.sum(A))*np.sum(B)))
However, the value I get is far greater than 1 and is not meaningful at all. It should lie between -1 and 1 to be meaningful.
I have also tried using the absolute values of matrices A and B, but that didn't work either.
I also tried to use:
np.sum(np.dot(A,B.T)) instead of np.sum(A*B)
in the numerator, but that didn't work either.
Edit1:
This is the formula that I intend to calculate:
In this image, C is one of the matrices and T is another one.
'u' is the mean symbol.
Can somebody tell me where I am making the mistake?
Upvotes: 3
Views: 19541
Reputation: 620
From the way the problem is described in the OP, the matrices are treated as arrays, so one could simply flatten them:
x = T1.flatten()
y = T2.flatten()
One could then use either the builtin numpy function proposed by @AakashMakwana:
import numpy as np
r = np.corrcoef(x, y)[0,1]
Remark: Note that without flattening this solution would produce the matrix of pairwise correlations.
Alternatively, one could use the equivalent scipy function:
from scipy.stats import pearsonr
r = pearsonr(x,y)[0]
Scipy additionally provides the possibility of calculating the Spearman correlation coefficient (spearmanr(x,y)[0]) or Kendall tau (kendalltau(x,y)[0]).
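As a side note on why the manual attempt in the question blows up: its denominator sums the centered values themselves, which add up to roughly zero, instead of their squares. Below is a minimal sketch of the usual Pearson formula written out by hand and compared with np.corrcoef. The random arrays are stand-ins for T1 and T2, not data from the question.

```python
import numpy as np

# Stand-in data (an assumption for illustration; not from the question)
rng = np.random.default_rng(0)
T1 = rng.random((4, 5))
T2 = rng.random((4, 5))

# Center each matrix on its own overall mean
A = T1 - T1.mean()
B = T2 - T2.mean()

# Pearson r: sum of products over the square root of the product of sums of squares
r_manual = np.sum(A * B) / np.sqrt(np.sum(A**2) * np.sum(B**2))

# Same quantity from the built-in function on the flattened arrays
r_numpy = np.corrcoef(T1.flatten(), T2.flatten())[0, 1]

print(r_manual, r_numpy)
```

The two printed values should agree up to floating-point rounding.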
Upvotes: 0
Reputation: 325
Well, I think this function does what I intended:
def correlation_coefficient(T1, T2):
    numerator = np.mean((T1 - T1.mean()) * (T2 - T2.mean()))
    denominator = T1.std() * T2.std()
    if denominator == 0:
        return 0
    else:
        result = numerator / denominator
        return result
The numerator is the tricky part: it does not look exactly like the formula shown in the image above, since it is the mean of the products of the deviations, and the denominator is just the product of the standard deviations of the two images.
However, the result makes sense now, as it always lies between -1 and 1.
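A quick sanity check of the function, as a sketch on a small made-up matrix (the test values are my own, not from the original images): identical inputs should give about 1, a negated copy about -1, and a constant matrix hits the zero-denominator guard.

```python
import numpy as np

def correlation_coefficient(T1, T2):
    # Mean of the products of deviations, divided by the product of the
    # (population) standard deviations
    numerator = np.mean((T1 - T1.mean()) * (T2 - T2.mean()))
    denominator = T1.std() * T2.std()
    return 0 if denominator == 0 else numerator / denominator

# Made-up test matrix, for illustration only
T = np.arange(12, dtype=float).reshape(3, 4)

r_same = correlation_coefficient(T, T)                  # perfectly correlated
r_neg = correlation_coefficient(T, -T)                  # perfectly anti-correlated
r_const = correlation_coefficient(T, np.ones_like(T))   # zero variance, guard returns 0

print(r_same, r_neg, r_const)
```

Note that the result can be negative, so the valid range is -1 to 1 rather than 0 to 1.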
Upvotes: 0
Reputation: 754
Can you try this:
import numpy as np
x = np.array([[0.1, .32, .2, 0.4, 0.8], [.23, .18, .56, .61, .12]])
y = np.array([[2,4,0.1, .32, .2],[1,3,.23, .18, .56]])
pearson = np.corrcoef(x,y)
print(pearson)
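One thing to be aware of with the snippet above: when given 2-D inputs, np.corrcoef treats every row as a separate variable, so the result is a matrix of pairwise row correlations and not a single number. A small sketch of the difference, reusing the same x and y; the flattening step is my assumption about what a single matrix-to-matrix coefficient should mean.

```python
import numpy as np

x = np.array([[0.1, .32, .2, 0.4, 0.8], [.23, .18, .56, .61, .12]])
y = np.array([[2, 4, 0.1, .32, .2], [1, 3, .23, .18, .56]])

# Each row of x and y is treated as one variable: 2 + 2 rows give a 4x4 matrix
pairwise = np.corrcoef(x, y)
print(pairwise.shape)

# Flattening first gives one coefficient for the two matrices as a whole
r = np.corrcoef(x.ravel(), y.ravel())[0, 1]
print(r)
```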
Upvotes: 1