Reputation: 115
Is there any convenient way in R to read a specific column (or multiple columns) from a fixed-width data file? E.g. the file looks like this:
10010100100002000000
00010010000001000000
10010000001002000000
Say I am interested in column 14. At the moment I read the whole file with read.fwf, passing as widths a vector of 1's whose length is the total number of columns:
data <- read.fwf("demo.asc", widths=rep(1,20))
data[,14]
[1] 2 1 2
This works well, but it doesn't scale to data sets with hundreds of thousands of columns and rows. Is there a more efficient way to do this?
Upvotes: 3
Views: 965
Reputation: 179398
You can use a connection and process the file in blocks:
Replicate your data:
dat <-"10010100100002000000
00010010000001000000
10010000001002000000"
Process in blocks using a connection:
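# Define a connection (for a file on disk, use file("demo.asc", "r") instead)
con <- textConnection(dat)
# Read the input in blocks and keep only character 14 of each line
linesPerUpdate <- 2
result <- character()
repeat {
  line <- readLines(con, linesPerUpdate)
  result <- c(result, substr(line, start = 14, stop = 14))
  # A short block means the end of the input has been reached
  if (length(line) < linesPerUpdate) break
}
# Close the connection
close(con)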
The result:
result
[1] "2" "1" "2"
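To apply the same idea to a file on disk and to several columns at once, something like the following might work. Treat it as a sketch rather than a finished solution: read_fwf_columns and its arguments cols and lines_per_block are names made up for illustration, and it assumes every field is exactly one character wide, as in your example.

```r
# Hypothetical helper (not part of base R): pull single-character columns
# out of a fixed-width file by reading it in blocks through a connection.
read_fwf_columns <- function(path, cols, lines_per_block = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))  # close the connection even if an error occurs
  blocks <- list()
  repeat {
    lines <- readLines(con, n = lines_per_block)
    if (length(lines) == 0L) break
    # One character vector per requested column, bound into a
    # (lines in block) x (number of columns) character matrix
    blocks[[length(blocks) + 1L]] <- do.call(
      cbind,
      lapply(cols, function(k) substr(lines, k, k))
    )
    # A short block means the end of the file has been reached
    if (length(lines) < lines_per_block) break
  }
  out <- if (length(blocks) > 0L) {
    do.call(rbind, blocks)
  } else {
    matrix(character(), nrow = 0L, ncol = length(cols))
  }
  colnames(out) <- paste0("V", cols)
  out
}

# Usage on a temporary copy of the example data
demo <- tempfile(fileext = ".asc")
writeLines(c("10010100100002000000",
             "00010010000001000000",
             "10010000001002000000"), demo)
res <- read_fwf_columns(demo, cols = c(1, 14), lines_per_block = 2L)
res[, "V14"]
```

The result is a character matrix, so convert with as.integer() if you need numbers. Collecting the blocks in a list and binding them once at the end avoids growing the result on every iteration. If you would rather stay with read.fwf, it also accepts negative widths to skip fields, e.g. widths = c(-13, 1, -6) for column 14 here, though I have not measured whether that is fast enough at your scale.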
Upvotes: 3