stephentgrammer

Reputation: 600

fread (data.table) select columns, throw error if column not found

I'm loading a CSV file into R with data.table's fread function. The file has a bunch of columns that I don't need, so the select parameter comes in handy. I've noticed, however, that if one of the columns named in select does not exist in the file, fread silently continues. Is it possible to make R throw an error if one of the selected columns doesn't exist in the file?

#csvfile has "col1" "col2" "col3" "col4" etc

colsToKeep <- c("col1", "col2", "missing")

data <- fread(csvfile, header=TRUE, select=colsToKeep, verbose=TRUE)

In the above example, data will have two columns: col1, col2. The remaining columns will be dropped as expected, but missing is silently skipped. It would certainly be nice to know that fread is skipping that column because it did not find it.

Upvotes: 9

Views: 10113

Answers (1)

shadowtalker

Reputation: 13903

I'd suggest parsing the first row pre-emptively, then throwing your own error. You could do:

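library(data.table)

read_cols <- function(file_name, colsToKeep) {
    # Read only the first line, as data, to get the column names in the file
    header <- fread(file_name, nrows = 1, header = FALSE)
    # TRUE only if every requested column appears in the header
    all_in_header <- all(colsToKeep %chin% unlist(header))
    stopifnot(all_in_header)

    fread(file_name, header=TRUE, select=colsToKeep, verbose=TRUE)
}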

my_data <- read_cols(csvfile, c("col1", "col2", "missing"))
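
If you'd rather have the error say which columns are missing, here is a minimal sketch of the same idea. The name read_cols_strict and the error wording are my own, not part of data.table. It assumes the file has a header row and that reading it with nrows = 0 is enough to get the column names:

```r
library(data.table)

# Hypothetical helper (not part of data.table): checks the header first,
# then reports exactly which requested columns are absent.
read_cols_strict <- function(file_name, cols_to_keep) {
    # nrows = 0 reads just the header, giving an empty table with the column names
    available <- names(fread(file_name, nrows = 0, header = TRUE))

    # Requested columns that the file does not contain
    missing_cols <- setdiff(cols_to_keep, available)
    if (length(missing_cols) > 0) {
        stop("Columns not found in ", file_name, ": ",
             paste(missing_cols, collapse = ", "))
    }

    fread(file_name, header = TRUE, select = cols_to_keep)
}
```

Calling read_cols_strict(csvfile, c("col1", "col2", "missing")) should then stop with a message that names missing, rather than the generic stopifnot failure.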

Upvotes: 9
