Luca

Reputation: 1350

read matrix from fixed-width file in R

I'm new to R. I have a file that contains a series of rows like this:

"0000010000010000000101000001000000011000000001
 0000000000000000000000010001000001001000110001
 0000100000000000000000010000000000000000010100
 0100000001100000000001001001100000010000000001
 0001000000000100010000010000000000010000000000"

and I want to build a matrix from these rows. So far I have written this code:

for(line in readLines(ff)){
     line <- as.numeric(substring(line, seq(1,nchar(line),1), seq(1,nchar(line),1)))
}

but it only extracts the lines from the file. How do I use the line vector to build a matrix?

Upvotes: 2

Views: 135

Answers (2)

juba

Reputation: 49033

EDIT: Thanks to Ananda Matho's and agstudy's suggestions, here is a much better way to handle the width argument automatically. If your data is in a file called test.txt, you can do:

width <- nchar(readLines("test.txt", n=1))
m <- as.matrix(read.fwf("test.txt", widths=rep(1,width)))

I assume that each 0/1 is a distinct value. In this case, you can use read.fwf, which lets you read data by specifying the width of each field:

text <- "0000010000010000000101000001000000011000000001
0000000000000000000000010001000001001000110001
0000100000000000000000010000000000000000010100
0100000001100000000001001001100000010000000001
0001000000000100010000010000000000010000000000"

m <- as.matrix(read.fwf(textConnection(text), widths=rep(1,46)))

Which gives:

R> m
     V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19
[1,]  0  0  0  0  0  1  0  0  0   0   0   1   0   0   0   0   0   0   0
[2,]  0  0  0  0  0  0  0  0  0   0   0   0   0   0   0   0   0   0   0
[3,]  0  0  0  0  1  0  0  0  0   0   0   0   0   0   0   0   0   0   0
[4,]  0  1  0  0  0  0  0  0  0   1   1   0   0   0   0   0   0   0   0
[5,]  0  0  0  1  0  0  0  0  0   0   0   0   0   1   0   0   0   1   0
     V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36
[1,]   1   0   1   0   0   0   0   0   1   0   0   0   0   0   0   0   1
[2,]   0   0   0   0   1   0   0   0   1   0   0   0   0   0   1   0   0
[3,]   0   0   0   0   1   0   0   0   0   0   0   0   0   0   0   0   0
[4,]   0   0   1   0   0   1   0   0   1   1   0   0   0   0   0   0   1
[5,]   0   0   0   0   1   0   0   0   0   0   0   0   0   0   0   0   1
     V37 V38 V39 V40 V41 V42 V43 V44 V45 V46
[1,]   1   0   0   0   0   0   0   0   0   1
[2,]   1   0   0   0   1   1   0   0   0   1
[3,]   0   0   0   0   0   1   0   1   0   0
[4,]   0   0   0   0   0   0   0   0   0   1
[5,]   0   0   0   0   0   0   0   0   0   0

In your case, replace textConnection(text) with your file name, and change the value 46 in rep(1,46) to the number of values in each row of your matrix.
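As a rough end-to-end sketch of the file-based version: the temporary file below is only a stand-in for your real file, and the names rows and path are made up for this illustration. The final unname() call is optional; it just drops the V1..V46 column names that read.fwf generates.

```r
# Hypothetical sample data, written to a temporary file that stands in for test.txt.
rows <- c("0000010000010000000101000001000000011000000001",
          "0000000000000000000000010001000001001000110001",
          "0000100000000000000000010000000000000000010100",
          "0100000001100000000001001001100000010000000001",
          "0001000000000100010000010000000000010000000000")
path <- tempfile(fileext = ".txt")
writeLines(rows, path)

# Take the number of fields from the first line, then read one character per column.
width <- nchar(readLines(path, n = 1))
m <- as.matrix(read.fwf(path, widths = rep(1, width)))

# Optional: remove the generated column names so the result is a bare matrix.
m <- unname(m)
dim(m)
```

This assumes every line has the same length as the first one; if that might not hold, it is worth checking nchar() over all lines before reading.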

Upvotes: 6

johannes

Reputation: 14413

You could also use:

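# Named "rows" rather than "t" to avoid masking base R's t() (transpose).
rows <- readLines(textConnection("0000010000010000000101000001000000011000000001
0000000000000000000000010001000001001000110001
0000100000000000000000010000000000000000010100
0100000001100000000001001001100000010000000001
0001000000000100010000010000000000010000000000"))

# Split each row into single characters, convert to numbers, and stack the rows.
do.call("rbind", lapply(strsplit(rows, ""), as.numeric))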

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
[1,]    0    0    0    0    0    1    0    0    0     0     0     1     0     0
[2,]    0    0    0    0    0    0    0    0    0     0     0     0     0     0
[3,]    0    0    0    0    1    0    0    0    0     0     0     0     0     0
[4,]    0    1    0    0    0    0    0    0    0     1     1     0     0     0
[5,]    0    0    0    1    0    0    0    0    0     0     0     0     0     1
     [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26]
[1,]     0     0     0     0     0     1     0     1     0     0     0     0
[2,]     0     0     0     0     0     0     0     0     0     1     0     0
[3,]     0     0     0     0     0     0     0     0     0     1     0     0
[4,]     0     0     0     0     0     0     0     1     0     0     1     0
[5,]     0     0     0     1     0     0     0     0     0     1     0     0
     [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37] [,38]
[1,]     0     1     0     0     0     0     0     0     0     1     1     0
[2,]     0     1     0     0     0     0     0     1     0     0     1     0
[3,]     0     0     0     0     0     0     0     0     0     0     0     0
[4,]     0     1     1     0     0     0     0     0     0     1     0     0
[5,]     0     0     0     0     0     0     0     0     0     1     0     0
     [,39] [,40] [,41] [,42] [,43] [,44] [,45] [,46]
[1,]     0     0     0     0     0     0     0     1
[2,]     0     0     1     1     0     0     0     1
[3,]     0     0     0     1     0     1     0     0
[4,]     0     0     0     0     0     0     0     1
[5,]     0     0     0     0     0     0     0     0
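The same strsplit idea can be wrapped in a small function that reads straight from a file. This is only a sketch: read_digit_matrix is a hypothetical helper name, not something from base R, and it assumes every non-empty line consists of single-digit values with no separators.

```r
# Hypothetical helper: read a file of digit rows into an integer matrix.
read_digit_matrix <- function(path) {
  rows <- readLines(path)
  rows <- rows[nzchar(rows)]  # ignore empty lines, e.g. a trailing blank line
  # All rows must have the same width, otherwise the matrix would be ragged.
  if (length(unique(nchar(rows))) != 1L) {
    stop("rows are empty or have different lengths")
  }
  # Split every row into single characters and fill the matrix row by row.
  digits <- unlist(strsplit(rows, ""))
  matrix(as.integer(digits), nrow = length(rows), byrow = TRUE)
}

# Usage with a temporary file standing in for the real data file.
sample_rows <- c("0000010000010000000101000001000000011000000001",
                 "0000000000000000000000010001000001001000110001",
                 "0000100000000000000000010000000000000000010100",
                 "0100000001100000000001001001100000010000000001",
                 "0001000000000100010000010000000000010000000000")
sample_path <- tempfile(fileext = ".txt")
writeLines(sample_rows, sample_path)
mat <- read_digit_matrix(sample_path)
dim(mat)
```

Filling with matrix(..., byrow = TRUE) avoids building one vector per row and binding them afterwards, and the length check fails early on malformed input instead of silently recycling values.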

Upvotes: 6
