micstr

Reputation: 5206

R data.table setting the key within a function

My problem: I pass a data.table to a set-up function and set the key there with setkeyv, but the key is lost once the function returns.

Could one of the data.table gurus explain what I am doing wrong? I expected the variable dt to keep the key set within the function, since data.table modifies objects by reference. What am I misunderstanding about scoping, and how do I fix it? (I am using data.table 1.9.6, R 3.2.2.)

library(data.table)
library(lubridate)  # provides parse_date_time(), used in PrepareData below

# test data for Stack Overflow
#   yes, I know the data.table could be keyed right here,
#   but normally I read my data from a csv file
dt <- data.table(date = c("2015-12-31","2016-01-01"), 
                 class = c("a","b"),
                 units = c(1000, 200))
tables()
# no key, as expected, since none has been set
#NAME NROW NCOL MB COLS             KEY
#[1,] dt      2    3  1 date,class,units

Then I want to clean imported csv data in a function, and key the table.

PrepareData <- function(x.dt, date.col, key.col) {
  # prepare unit price data table by keying on given columns and 
  #   converting date columns to date class

  require(data.table)

  # convert dates if date.col not blank
  if (!missing(date.col)) {
    if (nchar(date.col[1]) > 1) {
      for (j in date.col) {
        set(x.dt, j=j , 
            value = as.IDate(parse_date_time(x.dt[[j]], c("Ymd", "dmY"))))
        # Since data.table likes integer based dates
      }
    }    
  }

  # add key 
  if (!missing(key.col)) {
    if (nchar(key.col[1]) > 1) {
      setkeyv(x.dt, key.col)
    } 
  }

  # tables here shows a key is set
  tables()
  #NAME NROW NCOL MB COLS             KEY       
  #[1,] x.dt    2    3  1 date,class,units date,class

  return(x.dt)
}

But calling this function loses the key. I expected the key to be preserved when the table is passed back.

my.key.cols <- c("date", "class") # key columns

dt <- PrepareData(dt, "date", my.key.cols) 

tables()
#NAME NROW NCOL MB COLS             KEY
#[1,] dt      2    3  1 date,class,units    
#
# why did dt not keep the key? How should I fix this?

EDIT: Solved. The DT package was causing the issue.

R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggplot2_2.0.0    DT_0.1           lubridate_1.5.0  data.table_1.9.6

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.2      digest_0.6.8     plyr_1.8.3       chron_2.3-47     grid_3.2.2       gtable_0.1.2     magrittr_1.5    
 [8] scales_0.3.0     stringi_1.0-1    tools_3.2.2      stringr_1.0.0    htmlwidgets_0.5  munsell_0.4.2    colorspace_1.2-6
[15] htmltools_0.3 

Removing the DT package fixed this.

Upvotes: 1

Views: 72

Answers (1)

micstr

Reputation: 5206

Big thanks to @David Arenburg for confirming that it worked on his side and for pointing out the old package-conflict chestnut!

Running sessionInfo() showed that I had the DT package loaded. In my session this was conflicting with data.table. Removing the package fixed the error.

R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggplot2_2.0.0    DT_0.1           lubridate_1.5.0  data.table_1.9.6

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.2      digest_0.6.8     plyr_1.8.3       chron_2.3-47     grid_3.2.2       gtable_0.1.2     magrittr_1.5    
 [8] scales_0.3.0     stringi_1.0-1    tools_3.2.2      stringr_1.0.0    htmlwidgets_0.5  munsell_0.4.2    colorspace_1.2-6
[15] htmltools_0.3 
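For anyone who suspects a similar conflict, here is a rough sketch of how one could check where a function name resolves. It uses only base R (find, environment, conflicts) and diagnoses masking in general. It does not by itself confirm that DT masks setkeyv, and the output will depend on which packages are attached in your own session.

```r
library(data.table)

# Which attached packages define a function called "setkeyv"?
# More than one entry would point to masking; the first entry is the one
# that an unqualified call picks up.
find("setkeyv")

# Which namespace does the unqualified name actually resolve to?
# If this is not "data.table", something else is shadowing it.
environmentName(environment(setkeyv))

# List every name that is defined in more than one place on the search path.
conflicts(detail = TRUE)
```

If the second call reports anything other than data.table, qualifying the call with data.table:: sidesteps the lookup order entirely.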

FIX

# specifying data.table explicitly with :: helped
data.table::setkeyv(x.dt, key.col)
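As a minimal sketch of the behaviour I originally expected: with the namespace-qualified call, the key set inside a function is visible on the caller's table without reassigning it. The helper name key_by_reference is made up for this illustration and is not part of data.table.

```r
library(data.table)

# Hypothetical helper, for illustration only: keys the table it is given.
key_by_reference <- function(x.dt, key.col) {
  data.table::setkeyv(x.dt, key.col)  # modifies x.dt in place, no copy
  invisible(x.dt)
}

dt <- data.table(date = c("2015-12-31", "2016-01-01"),
                 class = c("a", "b"),
                 units = c(1000, 200))

key_by_reference(dt, c("date", "class"))  # no reassignment needed

key(dt)     # the key columns now set on the caller's dt
haskey(dt)  # TRUE once a key is set
```

Checking with key() or haskey() is a more direct test than reading the output of tables().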

I am leaving this solution in case others run into this conflict.

Upvotes: 1
