Reputation: 301
If I have a data table, foo, in R with a column named "date", I can get the vector of date values by the notation
foo[, date]
(Unlike data frames, date doesn't need to be in quotes).
How can this be done programmatically? That is, if I have a variable x whose value is the string "date", then how to I access the column of foo with that name?
Something that sort of works is to create a symbol:
sym <- as.name(x)
v <- foo[, eval(sym)]
...
As I say, that sort of works, but there is something not quite right about it. If that code is inside a function myFun in package myPackage, then it seems that it doesn't work if I explicitly use the package through:
myPackage::myFun(...)
I get an error message saying "undefined columns selected".
[edited] Some more details
Suppose I create a package called myPackage. This package has a single file with the following in it:
library(data.table)
#' export
myFun <- function(table1) {
names1 <- names(table1)
name1 <- names1[[1]]
sym <- as.Name(name1)
table1[, eval(sym)]
}
If I load that function using R Studio, then
myFun(tbl)
returns the first column of the data table tbl.
On the other hand, if I call
myPackage::myFun(tbl)
it doesn't work. It complains about
Error in .subset(x, j) : invalid subscript type 'builtin'
I'm just curious as to why myPackage:: would make this difference.
Upvotes: 1
Views: 3638
Reputation: 3720
I think the problem is that you've defined myFun
in your global environment, so it only appeared to work.
I changed as.Name
to as.name
, and created a package with the following functions:
library(data.table)
myFun <- function(table1) {
names1 <- names(table1)
name1 <- names1[[1]]
sym <- as.name(name1)
table1[, eval(sym)]
}
myFun_mod <- function(dt) {
# dt[, eval(as.name(colnames(dt)[1]))]
dt[[colnames(dt)[1]]]
}
Then, I tested it using this:
library(data.table)
myDt <- data.table(a=letters[1:3],b=1:3)
myFun(myDt)
myFun_mod(myDt)
myFun
didn't work
myFun_mod
did work
The output:
> library(test)
> myFun(myDt)
Error in eval(expr, envir, enclos) : object 'a' not found
> myFun_mod(myDt)
[1] "a" "b" "c"
then I added the following line to the NAMESPACE file:
import(data.table)
This is what @mnel was talking about with this link: Using data.table package inside my own package
After adding import(data.table)
, both functions work.
I'm still not sure why you got the particular .subset
error, which is why I went though the effort of reproducing the result...
Upvotes: 1
Reputation: 263479
A quick way which points to a longer way is this:
subset(foo, TRUE, date)
The subset
function accepts unquoted symbol/names for its 'subset' and 'select' arguments. (Its author, however, thinks this was a bad idea and suggests we use formulas instead.) This was the jumping off place for sections of Hadley Wickham's Advanced Programming webpages (and book).: http://adv-r.had.co.nz/Computing-on-the-language.html and http://adv-r.had.co.nz/Functional-programming.html . You can also look at the code for subset.data.frame:
> subset.data.frame
function (x, subset, select, drop = FALSE, ...)
{
r <- if (missing(subset))
rep_len(TRUE, nrow(x))
else {
e <- substitute(subset)
r <- eval(e, x, parent.frame())
if (!is.logical(r))
stop("'subset' must be logical")
r & !is.na(r)
}
vars <- if (missing(select))
TRUE
else {
nl <- as.list(seq_along(x))
names(nl) <- names(x)
eval(substitute(select), nl, parent.frame())
}
x[r, vars, drop = drop]
}
The problem with the use of "naked" expressions that get passed into functions is that their evaluation frame is sometimes not what is expected. R formulas, like other functions, carry a pointer to the environment in which they were defined.
Upvotes: 1