camille

Reputation: 16842

Multiple column names to quos in dplyr NSE

I'm writing functions to automate a workflow for analyzing a lot of demographic data. I can get what I need from a regular pipe-stream of dplyr functions, but I need to abstract this into NSE functions. I'm supplying a column name to a series of gather calls via a ... argument, but this only works with a single column; I need the option of using multiple columns. I'm having trouble with how to use quos(...) in this case.

There's more to the function, but I'm including just enough to show the error.

Sample of data:

library(tidyverse)

race_pops <- structure(list(
    town = c("Hamden", "Hamden", "Hamden", "Hamden","New Haven", "New Haven", "New Haven", "New Haven", "West Haven","West Haven", "West Haven", "West Haven"), 
    race = c("Total","White", "Black", "Latino", "Total", "White", "Black", "Latino","Total", "White", "Black", "Latino"), 
    est = c(61476, 37043, 13209,6450, 130405, 40164, 42970, 37231, 54972, 28864, 10677, 10977), 
    moe = c(31, 1039, 998, 879, 60, 1395, 1383, 1688, 42, 1226,1119, 1032), 
    region = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L,2L, 1L, 1L, 1L, 1L), .Label = c("Inner Ring", "New Haven"), class = "factor")), 
    class = c("tbl_df","tbl", "data.frame"), row.names = c(NA, -12L))

Here's a working bit that yields my desired output:

race_pops %>%
    gather(key = measure, value = value, est, moe) %>%
    unite("grp2", race, measure, sep = "_") %>%
    spread(key = grp2, value = value) %>%
    gather(key = grp2, value = value, -town, -region, -starts_with("Total")) %>%
    head(10)
#> # A tibble: 10 x 6
#>    town       region     Total_est Total_moe grp2       value
#>    <chr>      <fct>          <dbl>     <dbl> <chr>      <dbl>
#>  1 Hamden     Inner Ring     61476        31 Black_est  13209
#>  2 New Haven  New Haven     130405        60 Black_est  42970
#>  3 West Haven Inner Ring     54972        42 Black_est  10677
#>  4 Hamden     Inner Ring     61476        31 Black_moe    998
#>  5 New Haven  New Haven     130405        60 Black_moe   1383
#>  6 West Haven Inner Ring     54972        42 Black_moe   1119
#>  7 Hamden     Inner Ring     61476        31 Latino_est  6450
#>  8 New Haven  New Haven     130405        60 Latino_est 37231
#>  9 West Haven Inner Ring     54972        42 Latino_est 10977
#> 10 Hamden     Inner Ring     61476        31 Latino_moe   879

This is the function up to the point where I get the error:

gather_grp <- function(df, grp = group, value = est, moe = moe, ...) {
    name_vars <- quos(...)
    grp_var <- enquo(grp)
    value_var <- enquo(value)
    moe_var <- enquo(moe)

    df %>%
        gather(key = measure, value = value, -(!!!name_vars), -(!!grp_var)) %>%
        unite("grp2", !!grp_var, measure, sep = "_") %>%
        spread(key = grp2, value = value) %>%
        gather(key = grp2, value = value, -(!!!name_vars), -starts_with("Total"))
}

The function works if I drop region and use just the single column town:

race_pops %>%
    select(-region) %>%
    gather_grp(grp = race, value = est, moe = moe, town) %>%
    head(10)
#> # A tibble: 10 x 5
#>    town       Total_est Total_moe grp2       value
#>    <chr>          <dbl>     <dbl> <chr>      <dbl>
#>  1 Hamden         61476        31 Black_est  13209
#>  2 New Haven     130405        60 Black_est  42970
#>  3 West Haven     54972        42 Black_est  10677
#>  4 Hamden         61476        31 Black_moe    998
#>  5 New Haven     130405        60 Black_moe   1383
#>  6 West Haven     54972        42 Black_moe   1119
#>  7 Hamden         61476        31 Latino_est  6450
#>  8 New Haven     130405        60 Latino_est 37231
#>  9 West Haven     54972        42 Latino_est 10977
#> 10 Hamden         61476        31 Latino_moe   879

But I can't supply both town and region to the ...:

race_pops %>%
    gather_grp(grp = race, value = est, moe = moe, town, region)
#> Error in (~town): 2 arguments passed to '(' which requires 1

Created on 2018-05-08 by the reprex package (v0.2.0).

Thanks in advance!

Upvotes: 3

Views: 241

Answers (1)

akrun

Reputation: 887108

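Wrap the spliced quosures in c() instead of plain parentheses. !!!name_vars splices one argument per column, and '(' accepts exactly one argument, so -(!!!name_vars) fails as soon as more than one column is passed. c() takes any number of arguments, so -c(!!!name_vars) works for one column or several: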

gather_grp <- function(df, grp = group, value = est, moe = moe, ...) {
    name_vars <- quos(...)
    grp_var <- enquo(grp)
    value_var <- enquo(value)
    moe_var <- enquo(moe)


    df %>%
        gather(key = measure, value = value, -c(!!!name_vars), -!!grp_var) %>%
        unite("grp2", !!grp_var, measure, sep = "_") %>%
        spread(key = grp2, value = value) %>%
        gather(key = grp2, value = value, -c(!!!name_vars), -starts_with("Total"))
}

Running the function:

race_pops %>%
    gather_grp(grp = race, value = est, moe = moe, town, region)
# A tibble: 18 x 6
#   town       region     Total_est Total_moe grp2       value
#   <chr>      <fct>          <dbl>     <dbl> <chr>      <dbl>
# 1 Hamden     Inner Ring     61476        31 Black_est  13209
# 2 New Haven  New Haven     130405        60 Black_est  42970
# 3 West Haven Inner Ring     54972        42 Black_est  10677
# 4 Hamden     Inner Ring     61476        31 Black_moe    998
# 5 New Haven  New Haven     130405        60 Black_moe   1383
# 6 West Haven Inner Ring     54972        42 Black_moe   1119
# 7 Hamden     Inner Ring     61476        31 Latino_est  6450
# 8 New Haven  New Haven     130405        60 Latino_est 37231
# 9 West Haven Inner Ring     54972        42 Latino_est 10977
#10 Hamden     Inner Ring     61476        31 Latino_moe   879
#11 New Haven  New Haven     130405        60 Latino_moe  1688
#12 West Haven Inner Ring     54972        42 Latino_moe  1032
#13 Hamden     Inner Ring     61476        31 White_est  37043
#14 New Haven  New Haven     130405        60 White_est  40164
#15 West Haven Inner Ring     54972        42 White_est  28864
#16 Hamden     Inner Ring     61476        31 White_moe   1039
#17 New Haven  New Haven     130405        60 White_moe   1395
#18 West Haven Inner Ring     54972        42 White_moe   1226
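
To see why c() makes the difference, the two calls can be built with rlang::expr() and inspected outside of gather(). This is only an illustrative sketch: paren_call, c_call and paren_result are names made up for the demonstration, and it assumes the rlang package is installed.

```r
library(rlang)

# Two captured column names, as quos(...) would produce inside gather_grp.
name_vars <- quos(town, region)

# Splicing into `(` builds a call with two arguments.
paren_call <- expr((!!!name_vars))
length(paren_call) - 1   # number of arguments in the `(` call

# Splicing into c() builds a call with the same two arguments.
c_call <- expr(c(!!!name_vars))
length(c_call) - 1       # number of arguments in the c() call

# `(` only accepts one argument, so evaluating the first call fails,
# while c() accepts any number of arguments.
paren_result <- try(eval(paren_call), silent = TRUE)
inherits(paren_result, "try-error")
```

Both calls receive the same spliced arguments; only '(' rejects them, which matches the error message in the question.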

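In the single-column case, drop the other identifier column (here 'region') before calling the function. Any column that is not passed through ... or grp is gathered together with est and moe, so a leftover 'region' column would end up mixed into the measure values (alternatively, the function itself would need to be changed to handle this):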

race_pops %>% 
    dplyr::select(-region) %>% 
    gather_grp(grp = race, value = est, moe = moe, town)
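
The effect of leaving a column out of ... can be checked with a small sketch. The helper drop_ids() and the toy tibble below are hypothetical and only mimic the -c(!!!name_vars) selection used inside gather_grp; whatever columns remain after the drop are the ones gather() would stack into key/value pairs.

```r
library(dplyr)
library(rlang)

# Hypothetical helper: drops the columns named in ... using the same
# -c(!!!vars) pattern as gather_grp, and returns what is left.
drop_ids <- function(df, ...) {
  id_vars <- quos(...)
  df %>% select(-c(!!!id_vars))
}

# Toy data with two identifier columns and two measure columns.
toy <- tibble(
  town   = c("Hamden", "New Haven"),
  region = c("Inner Ring", "New Haven"),
  est    = c(61476, 130405),
  moe    = c(31, 60)
)

one_id  <- names(drop_ids(toy, town))          # "region" "est" "moe"
two_ids <- names(drop_ids(toy, town, region))  # "est" "moe"
```

With only town supplied, region is still among the remaining columns, which is why it has to be removed beforehand or passed through ... as well.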

Upvotes: 4
