Reputation: 43279
I am trying to read a flat file into R.
It is separated by ';' and has 12 leading lines of comments to describe the content. I want to read the file and exclude the comments.
The problem, however, is that commented line 11 contains the data headers, as follows:
# Fields: labno; name; dob; sex; location; date
Is there a way to extract the headers from the comments and apply them to the data? My idea was to read only the first 11 lines and store everything from labno onward as a vector. Then I would read everything from line 13 and use the stored vector as column names for the data.
Is there a way to read the first 11 lines and remove everything before labno?
Thanks.
Upvotes: 1
Views: 1531
Reputation: 263421
Step 1: (read only the eleventh line, which contains the column names, as a single string)
hdrs <- readLines("somefile.txt", n=11)[11]
Step 2: (read the rest of the file, splitting on ';' and allowing default automatic names)
dat <- read.table("somefile.txt", skip=12, sep=";")
Step 3: (remove extraneous characters and surrounding spaces before applying the ‘fields’ as column names)
names(dat) <- scan(textConnection(sub("^# Fields:", "", hdrs)),
what="character", sep=";", strip.white=TRUE)
Later versions of R allow ‘scan’ to have a ‘text’ argument rather than requiring the awkward textConnection function.
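For what it's worth, here is a minimal end-to-end sketch of the same idea using the ‘text’ argument. It assumes the layout described in the question (12 comment lines, field names on line 11, ';'-separated data from line 13). The temporary file and its sample rows are made up purely so the snippet runs on its own; swap in your real path.

```r
# Build a small stand-in file: 12 comment lines (line 11 holds the field
# names), followed by ';'-separated data. The contents are invented.
path <- tempfile(fileext = ".txt")
comments <- paste("# comment line", 1:12)
comments[11] <- "# Fields: labno; name; dob; sex; location; date"
rows <- c("1;Ann;1980-01-02;F;Leeds;2012-03-04",
          "2;Bob;1975-05-06;M;York;2012-03-05")
writeLines(c(comments, rows), path)

# Grab line 11 as one raw string, untouched by any field splitting.
hdr_line <- readLines(path, n = 11)[11]

# Drop the "# Fields:" prefix, split on ';' and trim surrounding spaces.
col_names <- scan(text = sub("^# Fields:", "", hdr_line),
                  what = "character", sep = ";",
                  strip.white = TRUE, quiet = TRUE)

# Read the data (line 13 onward) and attach the recovered names.
dat <- read.table(path, skip = 12, sep = ";", col.names = col_names,
                  stringsAsFactors = FALSE)
print(dat)
```

Passing the names through col.names at read time is just one option; assigning names(dat) afterwards, as in Step 3, should give the same result.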
Upvotes: 6