Reputation: 901
Given two matrices A and B in Python, I would like to find the correlation between their rows. Each matrix has shape 5×7.
I would like to compute the correlation between the rows of A and the rows of B, pair by pair as listed here, and then average the correlations:
A = data_All_Features_rating1000_topk_nr
B = data_All_Features_rating1000_leastk_nr
corr_1 = corrcoeff(A[0,:],B[0,:])
corr_2 = corrcoeff(A[0,:],B[1,:])
corr_3 = corrcoeff(A[0,:],B[2,:])
corr_4 = corrcoeff(A[0,:],B[3,:])
corr_5 = corrcoeff(A[0,:],B[4,:])
corr_6 = corrcoeff(A[1,:],B[1,:])
corr_7 = corrcoeff(A[1,:],B[2,:])
corr_8 = corrcoeff(A[1,:],B[3,:])
corr_9 = corrcoeff(A[1,:],B[4,:])
corr_10 = corrcoeff(A[2,:],B[2,:])
corr_11 = corrcoeff(A[2,:],B[3,:])
corr_12 = corrcoeff(A[2,:],B[4,:])
corr_13 = corrcoeff(A[3,:],B[3,:])
corr_14 = corrcoeff(A[3,:],B[4,:])
corr_15 = corrcoeff(A[4,:],B[4,:])
corravg = avg(corr_1,corr_2,...,corr_15)
This is what I tried:
topk = 5
corr_res = []
p = 0
for i in range(0,topk):
    for j in range(i,topk):
        a = data_All_Features_rating1000_topk_nr[i,:]
        b = data_All_Features_rating1000_leastk_nr[j,:]
        tmp = np.corrcoef(a,b)
        print tmp[0,1]
        corr_res = corr_res.extend(tmp[0,1])
I get this error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-159-ab1d737eed71> in <module>()
22 tmp = np.corrcoef(a,b)
23 print tmp[0,1]
---> 24 corr_res = corr_res.extend(tmp[0,1])
25 # print p+1
26 # print corr_res
TypeError: 'numpy.float64' object is not iterable
Upvotes: 2
Views: 3191
Reputation: 5070
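An efficient way to perform matrix operations in Python is to use the NumPy library. For correlation you could use the numpy.correlate function. Note that for two equal-length rows it returns their unnormalized cross-correlation (a dot product) as a one-element array, not the Pearson coefficient that numpy.corrcoef gives. To calculate it for the row combinations in the question you could use: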
import numpy as np

A = np.array([[1, 2, 3, 4], [2, 3, 5, 6], [1, 3, 4, 5], [7, 8, 2, 3]])
B = np.array([[1, 2, 3, 4], [3, 5, 6, 2], [3, 2, 4, 1], [9, 8, 2, 1]])
corr = []
for i in range(len(A)):
    # pair row i of A with rows i, i+1, ... of B
    for j in range(i, len(B)):
        # np.correlate returns a one-element array here, so extend works
        corr.extend(np.correlate(A[i], B[j]))
corr_avg = np.average(corr)
print(corr_avg)
print(" ".join(map(str, corr)))
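If the Pearson coefficient is what you are after, here is a minimal sketch of the same pairing done with np.corrcoef instead of np.correlate. The helper name mean_upper_row_correlation is made up for illustration, and the sketch assumes A and B have the same shape and that no row is constant (a constant row would give nan).

```python
import numpy as np

def mean_upper_row_correlation(A, B):
    """Average Pearson correlation over row pairs (A[i], B[j]) with i <= j.

    Hypothetical helper; assumes A and B have the same shape.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    if A.shape != B.shape:
        raise ValueError("A and B must have the same shape")
    n = A.shape[0]
    # np.corrcoef(A, B) treats every row of A and B as one variable and
    # returns a (2n, 2n) matrix; its top-right n x n block is corr(A[i], B[j]).
    cross = np.corrcoef(A, B)[:n, n:]
    rows, cols = np.triu_indices(n)  # all index pairs with i <= j
    pairs = cross[rows, cols]
    return pairs.mean(), pairs

A = np.array([[1, 2, 3, 4], [2, 3, 5, 6], [1, 3, 4, 5], [7, 8, 2, 3]])
B = np.array([[1, 2, 3, 4], [3, 5, 6, 2], [3, 2, 4, 1], [9, 8, 2, 1]])
corr_avg, corr = mean_upper_row_correlation(A, B)
print(corr_avg)
print(" ".join(map(str, corr)))
```

The pairs come out in the same order as the nested loops (row 0 of A against rows 0, 1, ... of B, then row 1 against rows 1, 2, ..., and so on), so the result can be checked against a loop over np.corrcoef(A[i], B[j])[0, 1].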
UPDATE
Instead of
print tmp[0,1]
corr_res = corr_res.extend(tmp[0,1])
try
print tmp[0,1]
corr_res.append(tmp[0,1])
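The list method extend takes an iterable object as input (another list, a tuple, ...), which is why passing it a numpy.float64 raises the TypeError. It also modifies the list in place and returns None, so its result should not be assigned back to corr_res. If you want to add a single scalar value to a list, use the append method.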
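A small sketch of the difference, using made-up values rather than the data from the question:

```python
import numpy as np

values = []

# extend consumes an iterable, mutates the list in place and returns None
result = values.extend([0.5, 0.25])

# append adds exactly one element, which is what a scalar correlation needs
values.append(np.float64(0.75))

# passing a scalar to extend reproduces the error from the question
try:
    values.extend(np.float64(1.0))
except TypeError as exc:
    error_message = str(exc)

print(result)         # None
print(values)
print(error_message)
```

Because extend returns None, writing corr_res = corr_res.extend(...) would also replace the list with None even when the argument is iterable.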
Upvotes: 2