nalzok

Reputation: 16107

Reading and reconstructing symmetric matrix with R

I need to read the following matrix from a file. It's a symmetric correlation matrix, so only the lower triangle (including the diagonal) is stored.

  1.00  
  0.49  1.00  
  0.53  0.57  1.00  
  0.49  0.46  0.48  1.00  
  0.51  0.53  0.57  0.57  1.00  
  0.33  0.30  0.31  0.24  0.38  1.00  
  0.32  0.21  0.23  0.22  0.32  0.43  1.00  
  0.20  0.16  0.14  0.12  0.17  0.27  0.33  1.00  
  0.19  0.08  0.07  0.19  0.23  0.24  0.26  0.25  1.00  
  0.30  0.27  0.24  0.21  0.32  0.34  0.54  0.46  0.28  1.00  
  0.37  0.35  0.37  0.29  0.36  0.37  0.32  0.29  0.30  0.35  1.00  
  0.21  0.20  0.18  0.16  0.27  0.40  0.58  0.45  0.27  0.59  0.31  1.00

Currently, I'm using

data1 <- na.omit(as.vector(t(read.table('triangle-data.txt', fill = TRUE))))
pt <- 12
R <- matrix(0, nrow = pt , ncol = pt)
for(i in 1:pt){
  R[i, 1:i] <- data1[(i*(i-1)/2 + 1): (i*(i+1)/2)]
}
R <- R + t(R) - diag(rep(1, pt))
R

The result is

> dput(R)
structure(c(1, 0.49, 0.53, 0.49, 0.51, 0.33, 0.32, 0.2, 0.19, 
0.3, 0.37, 0.21, 0.49, 1, 0.57, 0.46, 0.53, 0.3, 0.21, 0.16, 
0.08, 0.27, 0.35, 0.2, 0.53, 0.57, 1, 0.48, 0.57, 0.31, 0.23, 
0.14, 0.07, 0.24, 0.37, 0.18, 0.49, 0.46, 0.48, 1, 0.57, 0.24, 
0.22, 0.12, 0.19, 0.21, 0.29, 0.16, 0.51, 0.53, 0.57, 0.57, 1, 
0.38, 0.32, 0.17, 0.23, 0.32, 0.36, 0.27, 0.33, 0.3, 0.31, 0.24, 
0.38, 1, 0.43, 0.27, 0.24, 0.34, 0.37, 0.4, 0.32, 0.21, 0.23, 
0.22, 0.32, 0.43, 1, 0.33, 0.26, 0.54, 0.32, 0.58, 0.2, 0.16, 
0.14, 0.12, 0.17, 0.27, 0.33, 1, 0.25, 0.46, 0.29, 0.45, 0.19, 
0.08, 0.07, 0.19, 0.23, 0.24, 0.26, 0.25, 1, 0.28, 0.3, 0.27, 
0.3, 0.27, 0.24, 0.21, 0.32, 0.34, 0.54, 0.46, 0.28, 1, 0.35, 
0.59, 0.37, 0.35, 0.37, 0.29, 0.36, 0.37, 0.32, 0.29, 0.3, 0.35, 
1, 0.31, 0.21, 0.2, 0.18, 0.16, 0.27, 0.4, 0.58, 0.45, 0.27, 
0.59, 0.31, 1), .Dim = c(12L, 12L))

This is unwieldy, and I have to hard-code the matrix size (pt). Is there a more convenient way?

Upvotes: 0

Views: 76

Answers (1)

Vincent Guillemot

Reputation: 3429

I used a combination of readLines and strsplit to read the file:

a <- sapply(sapply(lapply(readLines("triangle.txt"), 
                          function(x) strsplit(x, " ")), "[", 1), 
            function(x) na.omit(as.numeric(x)))

and rbind to cast it into a square matrix:

A <- do.call("rbind", a)

rbind warns because the rows have different lengths and it recycles the shorter ones. The lower triangle is still read correctly, but the upper triangle ends up filled with recycled values, which I fixed with a small trick:

A[upper.tri(A)] <- 0
A <- A + t(A) - diag(nrow(A))
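
One possible way to avoid the rbind warning altogether is to pad each row with zeros to the full width before stacking. This is only a minimal sketch on a hypothetical 3 x 3 example; the `rows` list stands in for what readLines and strsplit would produce and is not taken from the file above.

```r
# Hypothetical ragged rows, as they might come out of readLines/strsplit.
rows <- list(1, c(0.49, 1), c(0.53, 0.57, 1))

# Pad every row with trailing zeros so that all rows have length n.
n <- length(rows)
A <- t(vapply(rows, function(r) c(r, numeric(n - length(r))), numeric(n)))

# Mirror the lower triangle; the diagonal was added twice, so subtract it once.
A <- A + t(A) - diag(diag(A))
A
```

Subtracting diag(diag(A)) instead of an identity matrix means the sketch does not assume a unit diagonal, so it should also work for covariance-like input.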

EDIT

Another simpler solution based on the vector of the coefficients:

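data1 <- na.omit(as.vector(t(read.table('triangle.txt', fill = TRUE))))
# Solve n * (n + 1) / 2 = length(data1); keep the positive root and round it to an integer
n <- round(max(Re(polyroot(c(-length(data1), 1/2, 1/2)))))
A <- matrix(0, n, n)
# Row-wise lower-triangle values fill the upper triangle column by column
A[upper.tri(A, diag = TRUE)] <- data1
A <- A + t(A) - diag(n)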
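
The matrix size can also be recovered in closed form: a lower triangle with diagonal holds n(n+1)/2 values, so n = (sqrt(8L + 1) - 1)/2 for L values. The sketch below is a variant under stated assumptions: it reads with scan() rather than read.table, and the inline `txt` string is a hypothetical stand-in for the file contents.

```r
# Hypothetical inline data standing in for the file; scan(text = ...) reads
# whitespace-separated numbers from a string instead of a path.
txt <- "1.00
0.49 1.00
0.53 0.57 1.00"
x <- scan(text = txt, quiet = TRUE)

# Solve n * (n + 1) / 2 = length(x) and check that n is a whole number.
n <- (sqrt(8 * length(x) + 1) - 1) / 2
stopifnot(n == round(n))
n <- as.integer(round(n))

A <- matrix(0, n, n)
# Row-wise lower-triangle values equal the column-wise upper triangle.
A[upper.tri(A, diag = TRUE)] <- x
# Copy the upper triangle into the lower one; no assumption about the diagonal.
A[lower.tri(A)] <- t(A)[lower.tri(A)]
A
```

To use it on a real file, replacing `text = txt` with the file path should be enough, since scan() skips the ragged line breaks and returns one flat vector.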

Upvotes: 1
