Reputation: 11
I have a large correlation matrix (roughly 50 × 50), which I calculated with cor(mydata). Now I would like a matching matrix of significance levels (p-values). With cor.test() I can get one p-value at a time, but is there an easy way to get all ~1200 of them?
Upvotes: 0
Views: 658
Reputation: 11
The function cor_pmat() from the ggcorrplot package gives you the p-values of the correlations.
library(ggcorrplot)
set.seed(123)
xmat <- matrix(rnorm(50), ncol = 5)
cor_pmat(xmat)
[,1] [,2] [,3] [,4] [,5]
[1,] 0.00000000 0.08034470 0.24441138 0.03293644 0.3234899
[2,] 0.08034470 0.00000000 0.08716815 0.44828479 0.4824117
[3,] 0.24441138 0.08716815 0.00000000 0.20634394 0.9504582
[4,] 0.03293644 0.44828479 0.20634394 0.00000000 0.8378530
[5,] 0.32348990 0.48241166 0.95045815 0.83785303 0.0000000
Upvotes: 1
Reputation: 23101
Here is one solution:
data <- swiss
#cor(data)
n <- ncol(data)
p.value.vec <- apply(combn(1:ncol(data), 2), 2, function(x)cor.test(data[,x[1]], data[,x[2]])$p.value)
p.value.matrix <- matrix(0, n, n)
# combn() orders the pairs as (1,2), (1,3), ..., (1,n), (2,3), ...,
# which is the column-wise order of the lower triangle (not the upper one)
p.value.matrix[lower.tri(p.value.matrix, diag=FALSE)] <- p.value.vec
# mirror the lower triangle so the matrix is symmetric
p.value.matrix <- p.value.matrix + t(p.value.matrix)
p.value.matrix
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.000000e+00 1.491720e-02 9.450437e-07 3.658617e-07 1.028523e-03 3.585238e-03
[2,] 1.491720e-02 0.000000e+00 9.951515e-08 1.304590e-06 5.204434e-03 6.844724e-01
[3,] 9.450437e-07 9.951515e-08 0.000000e+00 4.811397e-08 2.588308e-05 4.453814e-01
[4,] 3.658617e-07 1.304590e-06 4.811397e-08 0.000000e+00 3.018078e-01 5.065456e-01
[5,] 1.028523e-03 5.204434e-03 2.588308e-05 3.018078e-01 0.000000e+00 2.380297e-01
[6,] 3.585238e-03 6.844724e-01 4.453814e-01 5.065456e-01 2.380297e-01 0.000000e+00
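For a data set with 50 columns, the pairwise loop can also be avoided. For a Pearson correlation, cor.test() is a t-test on r with n - 2 degrees of freedom, so all p-values can be computed directly from cor(data). A minimal sketch, assuming Pearson correlations and no missing values (so every pair has the same n):

```r
# Assumption: Pearson correlation on complete numeric data (no NAs),
# so every pair of columns shares the same sample size.
data <- swiss
r <- cor(data)                      # correlation matrix
n.obs <- nrow(data)                 # observations per variable
df <- n.obs - 2                     # degrees of freedom of the t-test

# t statistic for every correlation at once; the diagonal (r = 1) becomes Inf
t.stat <- r * sqrt(df / (1 - r^2))
p.value.matrix <- 2 * pt(-abs(t.stat), df)   # two-sided p-values
diag(p.value.matrix) <- 0           # same diagonal convention as above

# Spot check against cor.test() for one pair of columns
all.equal(p.value.matrix[1, 4], cor.test(data[, 1], data[, 4])$p.value)
```

This keeps the row and column names of cor(data). It does not cover method = "spearman" or "kendall", where cor.test() computes the p-value differently.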
Upvotes: 0
Reputation: 10761
I think this should do what you want. We use expand.grid in conjunction with the apply function. Since you didn't provide your data, I created my own set:
set.seed(123)
xmat <- matrix(rnorm(50), ncol = 5)
matrix(apply(expand.grid(1:ncol(xmat), 1:ncol(xmat)),
             1,
             function(x) cor.test(xmat[,x[1]], xmat[,x[2]])$`p.value`),
       ncol = ncol(xmat), byrow = TRUE)
[,1] [,2] [,3] [,4] [,5]
[1,] 0.00000000 0.08034470 0.24441138 3.293644e-02 0.3234899
[2,] 0.08034470 0.00000000 0.08716815 4.482848e-01 0.4824117
[3,] 0.24441138 0.08716815 0.00000000 2.063439e-01 0.9504582
[4,] 0.03293644 0.44828479 0.20634394 1.063504e-62 0.8378530
[5,] 0.32348990 0.48241166 0.95045815 8.378530e-01 0.0000000
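The same square matrix can be sketched with outer() and Vectorize(), which makes it easy to keep variable names as row and column labels. The column names V1 to V5 and the helper pair.p are made up for illustration; the diagonal is set to 0 explicitly, because testing a column against itself only yields a value that is numerically close to 0.

```r
set.seed(123)
xmat <- matrix(rnorm(50), ncol = 5)
colnames(xmat) <- paste0("V", 1:5)    # hypothetical labels for illustration

# p-value for one pair of column indices (hypothetical helper)
pair.p <- function(i, j) cor.test(xmat[, i], xmat[, j])$p.value

# outer() expects a vectorized function, hence Vectorize()
idx <- seq_len(ncol(xmat))
pmat <- outer(idx, idx, Vectorize(pair.p))
dimnames(pmat) <- list(colnames(xmat), colnames(xmat))
diag(pmat) <- 0                       # a column against itself is not a meaningful test
pmat
```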
Note that if you don't need a square matrix and are comfortable with a long format instead (one row per pair of columns, which is easy to turn into a data.frame), we can use combn. It tests each pair only once, n(n-1)/2 tests instead of n^2, so it is more efficient:
cbind(t(combn(1:ncol(xmat), 2)),
combn(1:ncol(xmat), 2, function(x) cor.test(xmat[,x[1]], xmat[,x[2]])$`p.value`)
)
[,1] [,2] [,3]
[1,] 1 2 0.08034470
[2,] 1 3 0.24441138
[3,] 1 4 0.03293644
[4,] 1 5 0.32348990
[5,] 2 3 0.08716815
[6,] 2 4 0.44828479
[7,] 2 5 0.48241166
[8,] 3 4 0.20634394
[9,] 3 5 0.95045815
[10,] 4 5 0.83785303
Alternatively, we can perform the same operation with the pipe operator %>% to make it a bit more concise:
library(magrittr)
combn(1:ncol(xmat), 2) %>%
apply(., 2, function(x) cor.test(xmat[,x[1]], xmat[,x[2]])$`p.value`) %>%
cbind(t(combn(1:ncol(xmat), 2)), .)
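If the columns are named, the long format can also carry the variable names instead of column indices. A minimal sketch in base R, where the labels V1 to V5 and the names pairs, p.values and result are made up for illustration:

```r
set.seed(123)
xmat <- matrix(rnorm(50), ncol = 5)
colnames(xmat) <- paste0("V", 1:5)    # hypothetical labels for illustration

pairs <- t(combn(ncol(xmat), 2))      # one row per unordered pair of columns
p.values <- apply(pairs, 1, function(x) cor.test(xmat[, x[1]], xmat[, x[2]])$p.value)

result <- data.frame(var1 = colnames(xmat)[pairs[, 1]],
                     var2 = colnames(xmat)[pairs[, 2]],
                     p.value = p.values)
result[order(result$p.value), ]       # smallest p-values first
```

Sorting by p.value is optional. With around 1200 pairs it is a quick way to see which pairs stand out.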
Upvotes: 0