Reputation: 110054
If I need to use a data set (as a lookup table) inside a function in a package I'm creating, do I need to explicitly load the data set inside the function?
The function and the data set are both part of my package.
Is this the correct way to use that data set inside the function:
foo <- function(x){
x <- dataset_in_question
}
or is this better:
foo <- function(x){
x <- data(dataset_in_question)
}
or is there some approach I'm not thinking of that's correct?
Upvotes: 15
Views: 1474
Reputation: 110054
One can store the data set in R/sysdata.rda inside the package, as described by Hadley here: http://r-pkgs.had.co.nz/data.html#data-sysdata
Matthew Jockers uses this approach in the syuzhet package for data sets including the bing data set, as seen at ~line 452 here: https://github.com/mjockers/syuzhet/blob/master/R/syuzhet.R
bing is not exported to the user but is available inside the package, as demonstrated by syuzhet:::bing.
Essentially, the command devtools::use_data(..., internal = TRUE) will set everything up in the way it's needed (in newer versions of devtools this function has moved to usethis::use_data()).
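A minimal base-R sketch of what this setup amounts to, assuming a hypothetical lookup table `lookup_tbl` and using a temporary directory (`pkg_dir`) as a stand-in for the package root. It only illustrates the file layout; in a real package you would normally let use_data() write the file for you.

```r
# Stand-in for the package root (hypothetical; a real package already has an R/ folder).
pkg_dir <- file.path(tempdir(), "mypkg")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)

# Hypothetical lookup table used only for illustration.
lookup_tbl <- data.frame(
  key = c("a", "b", "c"),
  value = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Internal data lives in R/sysdata.rda; this is roughly what
# use_data(lookup_tbl, internal = TRUE) does when run from the package root.
sysdata_path <- file.path(pkg_dir, "R", "sysdata.rda")
save(lookup_tbl, file = sysdata_path)

# Package functions can then refer to the object by name, with no data() call.
foo <- function(x) {
  lookup_tbl$value[match(x, lookup_tbl$key)]
}
```

Objects stored this way are loaded into the package namespace, so they are visible to the package's functions but are not exported to users.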
Upvotes: 1
Reputation: 1100
For me it was necessary to use get() in addition to LazyData: true in the DESCRIPTION file (see the post by @Henrik, point 3) to get rid of the NOTE "no visible binding for global variable ...". My R version is 3.2.3.
foo <- function(x){
get("dataset_in_question")
}
So LazyData makes dataset_in_question directly accessible (without using data("dataset_in_question", envir = environment())), and get() is there to satisfy R CMD check.
HTH
Upvotes: 1
Reputation: 14460
There was a recent discussion about this topic (in the context of package development) on R-devel, numerous points of which are relevant to this question:
If only the options you provide are applicable to your example, R himself (i.e., Brian Ripley) tells you to do:
foo <- function(x){
data("dataset_in_question")
}
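As a runnable illustration of calling data() inside a function, here is a sketch that uses the built-in mtcars data set from the datasets package as a stand-in for dataset_in_question. The envir = environment() argument is an addition not shown in the snippet above: it loads the object into the function's own environment rather than the caller's workspace.

```r
# mtcars stands in for a data set shipped in your own package's data/ folder.
foo <- function() {
  # Load the data set into this function's environment only.
  data("mtcars", package = "datasets", envir = environment())
  nrow(mtcars)
}

n_rows <- foo()
```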
This approach will, however, throw a NOTE in R CMD check, which can be avoided in upcoming versions of R (or currently R devel) by using the globalVariables() function, added by John Chambers.
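A short sketch of how globalVariables() is typically used, assuming the call sits at top level in one of the package's R/ files (the file name R/globals.R in the comment is only a hypothetical convention):

```r
# Hypothetically placed in R/globals.R, outside of any function.
# Tells R CMD check that this name is intentionally used without a local binding.
utils::globalVariables("dataset_in_question")

foo <- function(x) {
  data("dataset_in_question", envir = environment())
  dataset_in_question
}
```

This only silences the check NOTE; it does not load or create the data set itself.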
The 'correct' approach (i.e., the one advocated by Brian Ripley and Peter Dalgaard) would be to use the LazyData option for your package. See this section of "Writing R Extensions".
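The LazyData setup can be sketched in base R as follows, assuming a throwaway package skeleton in a temporary directory; the package name lazypkg, the version number, and the contents of dataset_in_question are all hypothetical. The data set goes into data/, and DESCRIPTION gets a LazyData: true field.

```r
# Throwaway package skeleton (hypothetical name and location).
pkg_dir <- file.path(tempdir(), "lazypkg")
dir.create(file.path(pkg_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# Hypothetical lookup table saved where exported package data sets live.
dataset_in_question <- data.frame(key = c("a", "b"), value = c(10, 20))
save(dataset_in_question,
     file = file.path(pkg_dir, "data", "dataset_in_question.rda"))

# Minimal DESCRIPTION fields; LazyData is the one that matters here.
desc_path <- file.path(pkg_dir, "DESCRIPTION")
desc <- cbind(Package = "lazypkg", Version = "0.0.1", LazyData = "true")
write.dcf(desc, file = desc_path)

# With LazyData, package code can use the data set by name, without data().
foo <- function(x) {
  dataset_in_question$value[match(x, dataset_in_question$key)]
}
```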
Btw: I do not fully understand how your first approach should work. What should x <- dataset_in_question do? Is dataset_in_question a global variable or defined previously?
Upvotes: 12