R - text input file format and acceptable header characters or fields

Question

I have a data file that I need to read into R, but I'm running into problems. To resolve them, I've been looking for a guide to the header information that R can accept and read in a text input file. Unfortunately, I haven't found anything that describes what the input file itself should look like, only documentation on the commands used to import various file types.

As to my specific situation, I have a text file (with the extension .dat) that begins with several lines starting with @, which give additional information about the columns in the file, followed by data in a standard CSV layout. I'm guessing that the lines starting with @ can be read in and affect the structure of my data frame after input, although it's possible that R doesn't use this format at all. I'm doing all of this in RStudio on Ubuntu with R version 3.0.2.

The text file looks like this:

@relation bupa
@attribute Mcv integer [65.0, 103]
@attribute Alkphos integer [23.0, 138]
@attribute Sgpt integer [4.0, 155]
@attribute Sgot integer [5.0, 82]
@attribute Gammagt integer [5.0, 297]
@attribute Drinks real [0.0, 20.0]
@attribute Selector {1,2}
@inputs Mcv, Alkphos, Sgpt, Sgot, Gammagt, Drinks
@outputs Selector
@data
85.0, 92.0, 45.0, 27.0, 31.0, 0.0, 1
85.0, 64.0, 59.0, 32.0, 23.0, 0.0, 2
...

Now, I could simply skip these header lines as unnecessary and start reading from the actual data lines, but I'd like to bring the header information in as well if I can.
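
For reference, skipping the header lines could look roughly like the sketch below. It assumes that every non-data line begins with @, and the column names are typed in by hand from the @attribute lines rather than read from the file. The temporary file only stands in for my real bupa.dat so that the snippet runs on its own.

```r
# Recreate the layout shown above in a temporary file (stand-in for bupa.dat).
sample_lines <- c(
  "@relation bupa",
  "@attribute Mcv integer [65.0, 103]",
  "@attribute Alkphos integer [23.0, 138]",
  "@attribute Sgpt integer [4.0, 155]",
  "@attribute Sgot integer [5.0, 82]",
  "@attribute Gammagt integer [5.0, 297]",
  "@attribute Drinks real [0.0, 20.0]",
  "@attribute Selector {1,2}",
  "@inputs Mcv, Alkphos, Sgpt, Sgot, Gammagt, Drinks",
  "@outputs Selector",
  "@data",
  "85.0, 92.0, 45.0, 27.0, 31.0, 0.0, 1",
  "85.0, 64.0, 59.0, 32.0, 23.0, 0.0, 2"
)
tmp <- tempfile(fileext = ".dat")
writeLines(sample_lines, tmp)

# Column names copied by hand from the @attribute lines (not parsed from the file).
bupa_names <- c("Mcv", "Alkphos", "Sgpt", "Sgot", "Gammagt", "Drinks", "Selector")

# Treat every line that starts with "@" as a comment, so only the CSV rows remain.
# strip.white = TRUE drops the space that follows each comma.
bupa <- read.csv(tmp, header = FALSE, comment.char = "@",
                 strip.white = TRUE, col.names = bupa_names)

str(bupa)
```

This only discards the @ lines; it does nothing with the types and ranges they describe, which is the part I'd still like to keep.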

In case this is just an issue with the command I'm using to import, the specific commands I've tried and their associated error messages are:

> bupa2 <- read.csv("/bupa/bupa.dat", sep=",", header=T)
Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
  duplicate 'row.names' are not allowed
> bupa2 <- read.csv("/bupa/bupa.dat", sep=", ")
Error in scan(file, what = "", sep = sep, quote = quote, nlines = 1, quiet = TRUE,  : 
  invalid 'sep' value: must be one byte
> bupa2 <- read.csv("/bupa/bupa.dat", sep=",")
Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
  duplicate 'row.names' are not allowed
> bupa2 <- read.table("/bupa/bupa.dat", sep=",")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
  line 9 did not have 2 elements
> bupa2 <- read.table("/bupa/bupa.dat")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
  line 1 did not have 5 elements
> bupa2 <- scan("/bupa/bupa.dat")
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
  scan() expected 'a real', got '@relation'

What kinds of fields can be accepted by R in a text input file before the data? Is this file an R-supported format? Is there a special command associated with this format that I can use to import it?

Thank you.
