Reputation: 1541
I have n arrays of length m. I want to compute the Pearson correlation for every pair of arrays and then take the average of those correlations.
The arrays are stored as a NumPy array with shape (n, m).
One way to do this is with two nested for loops. However, I would like to know whether it can be written more simply in Python.
My current code looks like this:
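import numpy as np
sum_dd = 0.0
counter_dd = 0
for i in range(len(stc_data_roi)):
    for j in range(i + 1, len(stc_data_roi)):
        # np.corrcoef returns a 2x2 matrix; entry [0, 1] is the correlation of the pair
        sum_dd += np.corrcoef(stc_data_roi[i], stc_data_roi[j])[0, 1]
        counter_dd += 1
avg_dd = sum_dd / counter_dd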
Upvotes: 0
Views: 595
Reputation: 407
Suppose you have n = 4 arrays of length m = 5:
import numpy as np
import pandas as pd
n = 4
m = 5
X = np.random.rand(n, m)
print(X)
array([[0.49017121, 0.58751099, 0.87868983, 0.75328938, 0.16491984],
[0.81175397, 0.26486309, 0.42424784, 0.37485824, 0.66667452],
[0.80901099, 0.84121723, 0.36623767, 0.59928036, 0.22773295],
[0.59606777, 0.63301654, 0.30963807, 0.82884099, 0.95136045]])
Now transpose the array and convert it to a DataFrame, so that each column represents one array. Then use the pandas corr function, which computes the correlation between every pair of columns.
df = pd.DataFrame(X.T)
corr_coef = df.corr(method="pearson")
print(corr_coef)
corr_coef is an n x n symmetric matrix: entry (i, j) is the correlation coefficient between arrays i and j, and each diagonal entry (an array with itself) is 1.
# Sum the relevant coefficients, as in your code.
# Subtract n to drop the self-correlations on the diagonal.
# Divide by 2 because the matrix is symmetric, so each pair is counted twice.
corr_coef_sum = (corr_coef.sum().sum() - n) / 2
n_pairs = n * (n - 1) // 2  # 6 combinations in this example
corr_coef_avg = corr_coef_sum / n_pairs
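The same idea can also be sketched without pandas. When np.corrcoef is given a single 2-D array, it treats each row as a variable and returns the full n x n correlation matrix, so averaging the entries strictly above the diagonal gives the mean over all distinct pairs. The helper name mean_pairwise_pearson is hypothetical, and the sketch assumes the data has at least two rows and that no row is constant (a constant row would give NaN correlations).

```python
import numpy as np

def mean_pairwise_pearson(data):
    """Average Pearson correlation over all distinct row pairs of an (n, m) array."""
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    if n < 2:
        raise ValueError("need at least two arrays to form a pair")
    corr = np.corrcoef(data)              # (n, n) matrix; each row is one variable
    rows, cols = np.triu_indices(n, k=1)  # indices strictly above the diagonal
    return corr[rows, cols].mean()

# Example with random data of the same shape as above (n = 4, m = 5).
rng = np.random.default_rng(0)
X = rng.random((4, 5))
print(mean_pairwise_pearson(X))
```

Using the upper triangle avoids the subtract-and-halve bookkeeping, because the diagonal and the mirrored entries are never selected in the first place.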
Upvotes: 1