Valtteri Hemming

Reputation: 11

R Correlation significance matrix

I have a large correlation matrix (about 50 x 50), which I calculated with cor(mydata). Now I would like a matching matrix of significance levels (p-values). With cor.test() I can get the significance of one pair at a time, but is there an easy way to get all of the roughly 1,200 pairwise values at once?

Upvotes: 0

Views: 658

Answers (3)

David

Reputation: 11

The function cor_pmat from the ggcorrplot package gives you the p-values of correlations.

library(ggcorrplot)
set.seed(123)
xmat <- matrix(rnorm(50), ncol = 5)
cor_pmat(xmat)


          [,1]       [,2]       [,3]       [,4]      [,5]
[1,] 0.00000000 0.08034470 0.24441138 0.03293644 0.3234899
[2,] 0.08034470 0.00000000 0.08716815 0.44828479 0.4824117
[3,] 0.24441138 0.08716815 0.00000000 0.20634394 0.9504582
[4,] 0.03293644 0.44828479 0.20634394 0.00000000 0.8378530
[5,] 0.32348990 0.48241166 0.95045815 0.83785303 0.0000000
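
If adding a package is not an option, the same kind of matrix can probably be derived from the correlation matrix itself. The sketch below assumes Pearson correlations and no missing values, so every pair shares the same sample size; it uses the t statistic r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom, the same test cor.test() applies by default. The name cor_pvalues is hypothetical, not part of any package.

```r
# Hypothetical helper: Pearson p-values computed directly from cor(x).
# Assumes no missing values, so every pair is based on the same n rows.
cor_pvalues <- function(x) {
  r <- cor(x)                          # correlation matrix
  df <- nrow(x) - 2                    # degrees of freedom of the t test
  t_stat <- r * sqrt(df / (1 - r^2))   # Inf on the diagonal, where r = 1
  p <- 2 * pt(-abs(t_stat), df)        # two-sided p-value
  diag(p) <- 0                         # same diagonal convention as cor_pmat
  p
}

set.seed(123)
xmat <- matrix(rnorm(50), ncol = 5)
p_mat <- cor_pvalues(xmat)
p_mat
```

Because everything is vectorised, this needs a single cor() call rather than one cor.test() call per pair, which may matter for a 50-column data set.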

Upvotes: 1

Sandipan Dey

Reputation: 23101

Here is one solution:

data <- swiss
#cor(data)
n <- ncol(data)
# combn() lists the pairs as (1,2), (1,3), ..., (2,3), ...
p.value.vec <- apply(combn(1:ncol(data), 2), 2, function(x) cor.test(data[,x[1]], data[,x[2]])$p.value)
p.value.matrix <- matrix(0, n, n)
# That order matches the column-major order of the lower triangle, not the upper one
p.value.matrix[lower.tri(p.value.matrix, diag=FALSE)] <- p.value.vec
# Mirror the lower triangle to get a symmetric matrix
p.value.matrix <- p.value.matrix + t(p.value.matrix)
p.value.matrix

             [,1]         [,2]         [,3]         [,4]         [,5]         [,6]
[1,] 0.000000e+00 1.491720e-02 9.450437e-07 3.658617e-07 1.028523e-03 3.585238e-03
[2,] 1.491720e-02 0.000000e+00 9.951515e-08 1.304590e-06 5.204434e-03 6.844724e-01
[3,] 9.450437e-07 9.951515e-08 0.000000e+00 4.811397e-08 2.588308e-05 4.453814e-01
[4,] 3.658617e-07 1.304590e-06 4.811397e-08 0.000000e+00 3.018078e-01 5.065456e-01
[5,] 1.028523e-03 5.204434e-03 2.588308e-05 3.018078e-01 0.000000e+00 2.380297e-01
[6,] 3.585238e-03 6.844724e-01 4.453814e-01 5.065456e-01 2.380297e-01 0.000000e+00
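
Filling a triangle from a vector only works when the pair order matches R's column-major indexing, which is easy to get wrong. A minimal sketch that sidesteps the issue by writing each p-value to its (i, j) and (j, i) cells explicitly is shown below; the function name pairwise_cor_pvalues is made up for illustration, and the column names are carried over so the result can be read next to cor(data).

```r
# Hypothetical helper: symmetric matrix of cor.test() p-values.
pairwise_cor_pvalues <- function(data) {
  n <- ncol(data)
  p <- matrix(0, n, n, dimnames = list(colnames(data), colnames(data)))
  pairs <- combn(n, 2)               # each column is one pair (i, j) with i < j
  for (k in seq_len(ncol(pairs))) {
    i <- pairs[1, k]
    j <- pairs[2, k]
    # Write both cells explicitly so the fill order cannot go wrong
    p[i, j] <- p[j, i] <- cor.test(data[, i], data[, j])$p.value
  }
  p
}

p_swiss <- pairwise_cor_pvalues(swiss)
p_swiss["Fertility", "Education"]
```

Indexing by name, as in the last line, avoids having to remember which column number belongs to which variable.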

Upvotes: 0

bouncyball

Reputation: 10761

I think this should do what you want. It uses expand.grid together with apply to run cor.test on every ordered pair of columns:

Since you didn't provide your data, I created my own set.

set.seed(123)
xmat <- matrix(rnorm(50), ncol = 5)
matrix(apply(expand.grid(1:ncol(xmat), 1:ncol(xmat)),
      1,
      function(x) cor.test(xmat[,x[1]], xmat[,x[2]])$p.value),
      ncol = ncol(xmat), byrow = TRUE)

           [,1]       [,2]       [,3]         [,4]      [,5]
[1,] 0.00000000 0.08034470 0.24441138 3.293644e-02 0.3234899
[2,] 0.08034470 0.00000000 0.08716815 4.482848e-01 0.4824117
[3,] 0.24441138 0.08716815 0.00000000 2.063439e-01 0.9504582
[4,] 0.03293644 0.44828479 0.20634394 1.063504e-62 0.8378530
[5,] 0.32348990 0.48241166 0.95045815 8.378530e-01 0.0000000

Note that if you don't need a square matrix and are happy with a long table of pairs instead, we can use combn. It visits each unordered pair only once (10 tests instead of 25 for five columns), so it involves much less iteration and is more efficient.

cbind(t(combn(1:ncol(xmat), 2)),
    combn(1:ncol(xmat), 2, function(x) cor.test(xmat[,x[1]], xmat[,x[2]])$`p.value`)
)

      [,1] [,2]       [,3]
 [1,]    1    2 0.08034470
 [2,]    1    3 0.24441138
 [3,]    1    4 0.03293644
 [4,]    1    5 0.32348990
 [5,]    2    3 0.08716815
 [6,]    2    4 0.44828479
 [7,]    2    5 0.48241166
 [8,]    3    4 0.20634394
 [9,]    3    5 0.95045815
[10,]    4    5 0.83785303
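
The table above is a plain numeric matrix. If an actual data.frame is wanted, one possible sketch is below; the column names var1, var2 and p.value and the object name pair_table are illustrative choices, not anything required by R.

```r
set.seed(123)
xmat <- matrix(rnorm(50), ncol = 5)

pairs <- t(combn(ncol(xmat), 2))   # one row per unordered pair (i, j)
pair_table <- data.frame(
  var1 = pairs[, 1],
  var2 = pairs[, 2],
  p.value = apply(pairs, 1, function(x) cor.test(xmat[, x[1]], xmat[, x[2]])$p.value)
)

# Sort so the smallest p-values come first
pair_table <- pair_table[order(pair_table$p.value), ]
head(pair_table, 3)
```

Sorting by p-value makes it quick to see which pairs stand out, which is harder to spot in a 50 x 50 matrix.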

Alternatively, we can perform the same operation, but use the pipe operator %>% to make it a bit more concise:

library(magrittr)
combn(1:ncol(xmat), 2) %>%
    apply(., 2, function(x) cor.test(xmat[,x[1]], xmat[,x[2]])$`p.value`) %>%
    cbind(t(combn(1:ncol(xmat), 2)), .)

Upvotes: 0
