Match information from a correlation matrix according to their p-value cut off

Question

I used the rcorr function from the Hmisc package to calculate correlations and p-values, then extracted the p-values into one matrix (Pvalue below) and the correlation coefficients into another (corr).

Rvalue<-structure(c(1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 
0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 
1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 
1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1), .Dim = c(10L, 
10L), .Dimnames = list(c("41699", "41700", "41701", "41702", 
"41703", "41704", "41705", "41707", "41708", "41709"), c("41699", 
"41700", "41701", "41702", "41703", "41704", "41705", "41707", 
"41708", "41709")))

Pvalue<-structure(c(NA, 0, 0, 0, 0.0258814351024321, 0, 0, 0, 0, 0, 0, 
NA, 6.70574706873595e-14, 0, 0, 2.1673942640632e-09, 1.08217552696743e-07, 
0.0105345133269157, 0, 0, 0, 6.70574706873595e-14, NA, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, NA, 2.22044604925031e-15, 0, 0, 0, 0, 
0, 0.0258814351024321, 0, 0, 2.22044604925031e-15, NA, 0, 0, 
0, 0.000322310440723728, 0.00298460759118657, 0, 2.1673942640632e-09, 
0, 0, 0, NA, 0, 0, 0, 0, 0, 1.08217552696743e-07, 0, 0, 0, 0, 
NA, 0, 0, 0, 0, 0.0105345133269157, 0, 0, 0, 0, 0, NA, 0, 0, 
0, 0, 0, 0, 0.000322310440723728, 0, 0, 0, NA, 0, 0, 0, 0, 0, 
0.00298460759118657, 0, 0, 0, 0, NA), .Dim = c(10L, 10L), .Dimnames = list(
c("41699", "41700", "41701", "41702", "41703", "41704", "41705", 
"41707", "41708", "41709"), c("41699", "41700", "41701", 
"41702", "41703", "41704", "41705", "41707", "41708", "41709"
)))

I then converted the corr matrix to a Boolean (0/1) matrix (Rvalue above), in which 1 means a good correlation. Now I want to match the good correlations with significant p-values: I need an edge list that includes the p-value. I implemented the following code:

n=1
m=list()
for(i in 1:nrow(Rvalue))
  {
  for (j in 1:nrow(Rvalue))
    {
if (i



Then the output is:

> m
[[1]]
[1] "41699" "41700" "0"    

[[2]]
[1] "41699" "41701" "0"    

[[3]]
[3] "41699" "41702" "0"    

[[4]]
[4] "41699" "41704" "0" 
...


The result is correct, but because the matrices are very big, the loop takes a long time. How can I speed this up? Please note that I need the node names. Are there any functions for this?
I also found two similar questions, but they were not exactly what I needed. Thanks in advance.

akrun · Accepted Answer

You could try

indx <- which(Rvalue==1 & Pvalue < 0.05 & !is.na(Pvalue), arr.ind=TRUE)
d1 <- data.frame(rN=row.names(Rvalue)[indx[,1]],
                 cN=colnames(Rvalue)[indx[,2]],
                 Pval=signif(Pvalue[indx], digits=4))

head(d1,2)
#     rN    cN Pval
#1 41700 41699    0
#2 41701 41699    0
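
Because both matrices are symmetric, the call above returns every pair twice, once as (row, column) and once as (column, row). If each edge should be listed only once, as the question's output appears to do, one option is to restrict the test to the upper triangle. This is a minimal sketch, assuming symmetric matrices with matching dimnames; the toy matrices R and P and the helper name edge_list are made up for illustration:

```r
# Toy symmetric inputs (hypothetical data, not the question's matrices)
nodes <- c("a", "b", "c", "d")
R <- matrix(c(1, 1, 0, 1,
              1, 1, 1, 0,
              0, 1, 1, 1,
              1, 0, 1, 1), 4, 4, dimnames = list(nodes, nodes))
P <- matrix(c(NA,   0.01, 0.20, 0.03,
              0.01, NA,   0.04, 0.50,
              0.20, 0.04, NA,   0.30,
              0.03, 0.50, 0.30, NA), 4, 4, dimnames = list(nodes, nodes))

# edge_list is an illustrative helper, not part of any package
edge_list <- function(Rmat, Pmat, cutoff = 0.05) {
  # upper.tri() keeps only cells with row < column, so each pair appears once
  keep <- upper.tri(Rmat) & Rmat == 1 & !is.na(Pmat) & Pmat < cutoff
  indx <- which(keep, arr.ind = TRUE)
  # which() walks column by column; reorder by row to mimic a nested i/j loop
  indx <- indx[order(indx[, "row"], indx[, "col"]), , drop = FALSE]
  data.frame(from = rownames(Rmat)[indx[, "row"]],
             to   = colnames(Rmat)[indx[, "col"]],
             Pval = Pmat[indx],
             stringsAsFactors = FALSE)
}

edges <- edge_list(R, P)
```

The whole test is evaluated on the matrices at once, so no explicit loop over i and j is needed, and the result has half as many rows as the two-sided version.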

Update

I am not sure why you get the same result when you change the cutoff. The p-values may be so small that the condition is TRUE at every cutoff you tried. Here is an example to show that the code does return different values. Suppose I wrap the above code in a function:

 f1 <- function(Rmat, Pmat, cutoff){
   indx <- which(Rmat==1 & Pmat < cutoff & !is.na(Pmat), arr.ind=TRUE)
   d1 <- data.frame(rN=row.names(Rmat)[indx[,1]],
                    cN=colnames(Rmat)[indx[,2]],
                    Pval=signif(Pmat[indx], digits=4))
   d1
 }

 f1(R1, P1, 0.05)
 #  rN cN  Pval
 #1  B  A 0.021
 #2  C  A 0.018
 #3  D  A 0.001
 #4  A  B 0.021
 #5  A  C 0.018
 #6  E  C 0.034
 #7  A  D 0.001
 #8  C  E 0.034

 f1(R1, P1, 0.01)
 #  rN cN  Pval
 #1  D  A 0.001
 #2  A  D 0.001

 f1(R1, P1, 0.001)
 #[1] rN   cN   Pval
 #<0 rows> (or 0-length row.names)
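
To check whether changing the cutoff can make any difference for a given matrix, it may help to count how many p-values fall below each threshold. This is a small sketch with a made-up matrix P (hypothetical values, not the question's data):

```r
# Hypothetical symmetric p-value matrix with NA on the diagonal
nodes <- c("a", "b", "c")
P <- matrix(c(NA,    1e-12, 0.03,
              1e-12, NA,    2e-4,
              0.03,  2e-4,  NA), 3, 3, dimnames = list(nodes, nodes))

cutoffs <- c(0.05, 0.01, 0.001)

# Take each pair once (upper triangle) and count the values below every cutoff
p_upper <- P[upper.tri(P)]
n_below <- sapply(cutoffs, function(a) sum(p_upper < a))
names(n_below) <- cutoffs

# Smallest and largest off-diagonal p-values
p_range <- range(p_upper)
```

If the counts are identical for two cutoffs, no p-value lies between them, so the edge list cannot change when switching from one to the other.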

data

set.seed(24)
R1 <- matrix(sample(c(0,1), 5*5, replace=TRUE), 5,5, 
            dimnames=list(LETTERS[1:5], LETTERS[1:5]))
R1[lower.tri(R1)] <- 0
R1 <- R1+t(R1)
diag(R1) <- 1


set.seed(49)
P1 <- matrix(sample(seq(0,0.07, by=0.001), 5*5, replace=TRUE), 5, 5,
       dimnames=list(LETTERS[1:5], LETTERS[1:5]))

P1[lower.tri(P1)] <- 0
P1 <- P1+t(P1)
diag(P1) <- NA
