julieth

Reputation: 442

Function with dplyr with two variables

I am trying to carry out the following dplyr task, but within a function.

library("dplyr")

iris %>%
    group_by(Species) %>%
    summarise(N = sum(Petal.Width == 0.2, na.rm = T))

I was thinking along the lines of the following, which is not complete because I am unclear on the syntax.

getSummary <- function(varName,level) {
    summary <- iris %>%
        group_by(Species %>%
        summarise_(N = interp(~sum(var == ilevel, na.rm = T), 
                   var = as.name(varName))))
    sums <- summary$N       
}

In this case level is the numeric 0.2. Does anything change if the value is the character "0.2" instead?

Upvotes: 1


Answers (1)

alistaire

Reputation: 43334

dplyr is in the process of switching from a lazyeval-powered non-standard evaluation (NSE) system to an rlang-powered one. With the new version (available now from GitHub, and soon from CRAN), you can use

library(dplyr)

getSummary <- function(varName, level) {
    varName <- enquo(varName)    # capture the bare column name as a quosure
    iris %>%
        group_by(Species) %>%
        summarise(N = sum((!!varName) == level),    # unquote with !! to use
                  var = rlang::quo_text(varName))    # turn quosure to string  
}

getSummary(Petal.Width, 0.2)
#> # A tibble: 3 × 3
#>      Species     N         var
#>       <fctr> <int>       <chr>
#> 1     setosa    29 Petal.Width
#> 2 versicolor     0 Petal.Width
#> 3  virginica     0 Petal.Width

# or make it accept strings
getSummary <- function(varName, level) {
    iris %>%
        group_by(Species) %>%
        summarise(N = sum((!!rlang::sym(varName)) == level), 
                  var = varName) 
}

getSummary('Sepal.Length', 5.0)
#> # A tibble: 3 × 3
#>      Species     N          var
#>       <fctr> <int>        <chr>
#> 1     setosa     8 Sepal.Length
#> 2 versicolor     2 Sepal.Length
#> 3  virginica     0 Sepal.Length
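On dplyr releases newer than the one described above, the same two variants can probably be written more compactly. The following is a minimal sketch, assuming dplyr >= 1.0.0 (with rlang >= 0.4.0) is installed; the names getSummaryBare and getSummaryString are hypothetical and only used here to keep the two variants apart.

```r
library(dplyr)

# Bare column name: embracing with {{ }} quotes and unquotes in one step
# (assumes rlang >= 0.4.0, which introduced the {{ }} operator)
getSummaryBare <- function(varName, level) {
    iris %>%
        group_by(Species) %>%
        summarise(N = sum({{ varName }} == level, na.rm = TRUE))
}

# Column name as a string: index the .data pronoun with [[ ]]
getSummaryString <- function(varName, level) {
    iris %>%
        group_by(Species) %>%
        summarise(N = sum(.data[[varName]] == level, na.rm = TRUE),
                  var = varName)
}

bare <- getSummaryBare(Petal.Width, 0.2)
string <- getSummaryString("Petal.Width", 0.2)
```

Both calls should return the same per-species counts as the enquo() and sym() versions; na.rm = TRUE is carried over from the question's original summarise() call.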

With the old lazyeval syntax, it would look like

getSummary <- function(varName, level) {
    iris %>%
        group_by(Species) %>%
        summarise_(N = lazyeval::interp(~sum(x == y),    # formula to substitute into
                                        x = lazyeval::lazy(varName),    # substituted but unevaluated name
                                        y = level),    # value to substitute
                   var = ~lazyeval::expr_text(varName))    # convert expression to string (equivalent to `deparse(substitute(...))`)
}

getSummary(Sepal.Length, 5.0)
#> # A tibble: 3 × 3
#>      Species     N          var
#>       <fctr> <int>        <chr>
#> 1     setosa     8 Sepal.Length
#> 2 versicolor     2 Sepal.Length
#> 3  virginica     0 Sepal.Length

# or make it accept strings
getSummary <- function(varName, level) {
    iris %>%
        group_by(Species) %>%
        summarise_(N = lazyeval::interp(~sum(x == y), 
                                        x = as.name(varName), 
                                        y = level),
                   var = ~varName)
}

getSummary('Petal.Width', 0.2)
#> # A tibble: 3 × 3
#>      Species     N         var
#>       <fctr> <int>       <chr>
#> 1     setosa    29 Petal.Width
#> 2 versicolor     0 Petal.Width
#> 3  virginica     0 Petal.Width
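As for passing the level as the character "0.2": none of the functions need to change, because == coerces the numeric column to character before comparing. The base R sketch below illustrates this on iris; the variable names are made up for the example. The match then depends on how R prints each number, so a differently formatted string such as "0.20" would not match, and converting with as.numeric() first is likely the safer choice.

```r
# Numeric level: an ordinary numeric comparison
num_hits <- sum(iris$Petal.Width == 0.2)

# Character level: the column is coerced to character, then compared as strings
chr_hits <- sum(iris$Petal.Width == "0.2")

# A differently formatted string no longer matches, since "0.2" != "0.20"
padded_hits <- sum(iris$Petal.Width == "0.20")

# Converting the level explicitly keeps the comparison numeric
safe_hits <- sum(iris$Petal.Width == as.numeric("0.20"))
```

Here num_hits, chr_hits and safe_hits agree, while padded_hits is zero.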

Upvotes: 3
