Max Black

Reputation: 13

Passing Column Name as Parameter to data.table::setkey() --- some columns are not in the data.table: col_name

What I want is essentially the same as in these two posts: 1, 2. However, when I try the solutions given there, I keep getting an error.

My problem differs in that I am using the data.table package and am trying to set a key. For more details, see here.

Now, for the sake of an example, let's suppose I have a data frame and set its key as below:

data <- data.table::as.data.table(data.frame(A = c(1, 2, 3), B = c("one", "two", "three")))
data <- data.table::setkey(data, A)

This works. Now, I can then filter by some other data structure as below:

matches <- data[c(1)]

The above line will create a data.table that is a subset of data where the value of the variable A is 1.

Now, let's suppose I'd like to make this a generic function. I cannot get the below to work:

genericFunction <- function(data, col_name, filter){
    #Convert data.frame to data.table
    data <- data.table::as.data.table(data)

    #Set the key based on a variable name
    #Error is in this step
    data <- data.table::setkey(data, col_name)

    #Save the subset of data
    matches <- data[c(filter)]

    return(matches)
}

That is, if I go to do the following:

exampleData <- data.frame(A = c(1, 2, 3), B = c("one", "two", "three"))
exampleName <- "A"
exampleFilter <- 1

genericFunction(exampleData, exampleName, exampleFilter)

I get the following error:

 Error in setkeyv(x, cols, verbose = verbose, physical = physical) : 
  some columns are not in the data.table: col_name 

I know I'm supposed to use lazyeval::interp() or something along those lines; however, the implementations in the linked examples above do not work for me. Does anyone have any ideas as to what I should do? Any help is appreciated.

Upvotes: 0

Views: 768

Answers (1)

Ronak Shah

Reputation: 389135

I'm not a data.table expert, but ?setkey says:

setkey(x, ..., verbose=getOption("datatable.verbose"), physical = TRUE)

... - The columns to sort by. Do not quote the column names.

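which means you cannot pass a quoted column name here, nor a variable that holds one: setkey() treats col_name as a literal column name, which is why the error mentions col_name.

You can use setkeyv instead: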

setkeyv(x, cols, verbose=getOption("datatable.verbose"), physical = TRUE)

cols - A character vector of column names

Applied to your function:

genericFunction <- function(data, col_name, filter){
  #Convert data.frame to data.table
  data <- data.table::as.data.table(data)
  
  data <- data.table::setkeyv(data, col_name)
  
  #Save the subset of data
  matches <- data[c(filter)]
  
  return(matches)
}

exampleData <- data.frame(A = c(1, 2, 3), B = c("one", "two", "three"))
exampleName <- "A"
exampleFilter <- 1

genericFunction(exampleData, exampleName, exampleFilter)

#   A   B
#1: 1 one

Upvotes: 1
