A lot of the time in dplyr, we do something like:
mydat %>% select(., mycol1, mycol2, mycol3)
However, mycol1, mycol2, and mycol3 are not strings but just text in R. How does the function know to convert them into strings?
For instance, if I were to do:
dat <- data.frame(blue = rnorm(100), red= rnorm(100))
mysum <- function(dat, x, y){
browser()
return (sum(dat$x)+ sum(dat$y))
}
mysum(dat, blue, red)
Upvotes: 1
Your function is always going to deliver 0 because the $ infix function uses non-standard evaluation of its right-hand side argument: dat$x looks for a column literally named x, finds none, and returns NULL, and sum(NULL) is 0. (As you point out, non-standard evaluation is a favorite mechanism in @hadley's functions. For me it's a barrier, but for many people it seems to be a welcome strategy.) If you write your function in that manner (using $) you will generally fail to get what you want:
mysum(dat, blue, red)
[1] 0 # Wrong answer
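To see where the 0 comes from, the pieces can be inspected one at a time. This is only a small sketch; the data frame and the variable names (missing_col, empty_sum, found_col, real_sum) are made up for illustration:

```r
dat <- data.frame(blue = c(1, 2, 3), red = c(10, 20, 30))

# `$` takes its right-hand side literally, so this looks for a column named "x"
x <- "blue"
missing_col <- dat$x            # NULL: there is no column called "x"
empty_sum <- sum(missing_col)   # sum(NULL) is 0

# `[[` evaluates its argument, so the value of x ("blue") is used instead
found_col <- dat[[x]]
real_sum <- sum(found_col)      # 1 + 2 + 3 = 6
```

The same thing happens inside mysum: the parameters x and y are never looked at by $, only their literal names are.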
You said earlier that: "However, mycol1, mycol2, and mycol3 are not strings but just text in R." I guess you are trying to say that mycol1 is not enclosed in quotes and so is not a character literal. In R such "text" (a sequence of unquoted characters) is called a 'symbol' or a 'name'. (Up to this point we are not talking about anything to do with dplyr.) If you want to write a function that will deliver that sum, you would do so like this (avoiding the $ operation):
mysum <- function(dat, x, y){
return (sum(dat[[x]])+ sum(dat[[y]]))
}
mysum(dat, 'blue', 'red')
[1] 19.16727
If you want to retrieve the argument name for a matched parameter you need to use the deparse(substitute(.)) maneuver:
dat <- data.frame(blue = rnorm(10), red= rnorm(10))
mysum2 <- function(dfrm, arg1, arg2){
  a1 <- deparse(substitute(arg1))
  a2 <- deparse(substitute(arg2))
  sum(dfrm[[a1]]) + sum(dfrm[[a2]])
}
mysum2(dat, blue, red)
#[1] -0.5754979
mysum(dat, "blue", "red")
#[1] -0.5754979
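It may help to look at what each half of that maneuver returns on its own. The helper below, show_capture, is a hypothetical name used only for this illustration:

```r
show_capture <- function(arg) {
  expr <- substitute(arg)   # the unevaluated expression the caller supplied
  list(class = class(expr), text = deparse(expr))
}

# blue is never evaluated here, so no object called blue needs to exist
res <- show_capture(blue)
res$class   # "name"  (i.e. a symbol)
res$text    # "blue"  (a character string usable with [[ ]])
```

So substitute hands back the symbol, and deparse turns that symbol into the string that [[ needs.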
If you want to see how @hadley does it, then just type:
> dplyr::select
function (.data, ...)
{
select_(.data, .dots = lazyeval::lazy_dots(...))
}
<environment: namespace:dplyr>
... which doesn't really deliver the answer, does it? So we will need to try this:
help(package = "lazyeval")
... which has an accompanying vignette named "lazyeval::lazyeval" --> "Lazyeval: a new approach to NSE". Hadley argues that his lazyeval functions are superior to the traditional substitute because they carry forward their environments, and I suppose I do agree.
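The point about environments can be illustrated in base R alone. The sketch below is not lazyeval's API; capture_expr, capture_with_env, and make_captures are hypothetical helpers that only mimic the idea of storing an expression together with the environment it was written in:

```r
# Hypothetical helpers for illustration only
capture_expr <- function(x) substitute(x)             # expression only
capture_with_env <- function(x) {
  list(expr = substitute(x), env = parent.frame())    # expression plus caller's environment
}

make_captures <- function() {
  n <- 10                                 # local to make_captures
  list(bare = capture_expr(n + 1),
       lazy = capture_with_env(n + 1))
}

caps <- make_captures()
n <- 1000                                 # a different n where the evaluation happens

bare_result <- eval(caps$bare)                       # uses the n found here: 1001
lazy_result <- eval(caps$lazy$expr, caps$lazy$env)   # uses the original n: 11
```

With a bare substitute the expression is detached from the place it came from, so a later eval can silently pick up the wrong n; keeping the environment alongside the expression avoids that.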
Upvotes: 4