Reputation: 2194
I would like to compute correlations between the columns of a matrix, but only for a band of the correlation matrix. I know how to get the whole correlation matrix:
X <- matrix(rnorm(20*30), nrow=20)
cor(X)
But as shown in the left figure below, I'm only interested in some band below the main diagonal.
I could try to cleverly subset the original matrix to get only the little squares shown in the right figure, but this seems cumbersome.
Do you have a better idea or solution to the problem?
EDIT
I forgot to mention this, but I can hardly use a for loop in R, since the dimension of the correlation matrix is rather large (about 2000*2000) and I have to do this process around 100 times.
Upvotes: 1
Views: 297
Reputation: 2446
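Try a for loop:
band_width = 5  # example value: number of sub-diagonals to fill
p = ncol(X)     # correlations are between columns, so the result is p x p
band_cor_mat = matrix(NA_real_, nrow = p, ncol = p)
for (cc in seq_len(p)) {                          # column on the diagonal
  for (mm in seq_len(min(band_width, p - cc))) {  # offsets below the diagonal
    band_cor_mat[cc + mm, cc] = cor(X[, cc + mm], X[, cc])
  }
}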
You will have a correlation matrix with correlation values in the band and NAs for the rest.
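As a cross-check for a loop like this, the same NA-padded band can be cut out of the full correlation matrix. This is only a sketch: `band_width = 3` is an assumed example value, and `full` and `d` are illustrative names.

```r
set.seed(1)
X = matrix(rnorm(20 * 30), nrow = 20)
band_width = 3                       # assumed example value

full = cor(X)                        # full 30 x 30 correlation matrix
d = row(full) - col(full)            # 0 on the diagonal, positive below it
full[d < 1 | d > band_width] = NA    # keep only sub-diagonals 1..band_width
```

This still computes every correlation, so it is useful for verifying results on a small matrix rather than for saving time.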
Upvotes: 1
Reputation: 546193
You’re probably right that cor on the whole matrix is faster than using manual loops, since the internal workings of cor are highly optimised for matrices. But the bigger the matrix (and, conversely, the smaller the band), the more benefit you could reap from manually looping over the band.
That said, maybe just give it a try – the code for the manual loop is trivial:
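cor_band = function (x, band_width, method = c('pearson', 'kendall', 'spearman')) {
  method = match.arg(method)  # resolve the default vector to a single method
  p = ncol(x)
  out = matrix(NA_real_, nrow = p, ncol = p)
  for (i in seq_len(p)) {
    # j runs from the diagonal down to band_width rows below it
    for (j in i : min(i + band_width, p)) {
      out[j, i] = cor(x[, j], x[, i], method = method)
    }
  }
  out
}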
Note that the indices in out are reversed so that we get the band below the diagonal rather than above. Since the correlation matrix is symmetrical, either works.
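If calling cor once per pair turns out to be too slow, one possible variation (a sketch for the Pearson case only, not part of the function above) is to standardise the columns once and then fill one sub-diagonal at a time, so the R-level loop runs band_width times instead of once per pair. The name `cor_band_vec` is hypothetical.

```r
cor_band_vec = function (x, band_width) {
  n = nrow(x)
  p = ncol(x)
  z = scale(x)                       # centre and scale columns (sd uses n - 1)
  out = matrix(NA_real_, nrow = p, ncol = p)
  diag(out) = 1
  for (k in seq_len(min(band_width, p - 1))) {
    lower = seq_len(p - k)           # column indices i
    upper = lower + k                # row indices j = i + k
    # Pearson correlation of standardised columns: sum(z_i * z_j) / (n - 1)
    r = colSums(z[, lower, drop = FALSE] * z[, upper, drop = FALSE]) / (n - 1)
    out[cbind(upper, lower)] = r     # fill the k-th sub-diagonal
  }
  out
}

set.seed(42)
X = matrix(rnorm(20 * 30), nrow = 20)
res = cor_band_vec(X, band_width = 3)
```

The result has the same layout as before: values on the diagonal and in the band below it, NA elsewhere. Whether it beats a plain cor(X) for a given size and band width is something to measure, for example with system.time.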
Upvotes: 1