Using rlang and purrr to create new columns based on subset of existing columns

Question

I have a dataframe that covers multiple years like this:

library(dplyr)

df <- tibble(good_2018 = 0,
             bad_2018 = 1,
             id_2018 = 0,
             good_2019 = 3,
             bad_2019 = 1,
             id_2019 = 1)

I want to derive new columns based on the data for each year t (e.g., 2018 and 2019). If the id variable for year t does not equal 0, then the outcome should be the percentage identified as good for year t. The resulting dataset should look like this:

df %>% 
  mutate(pct_good_2018 = if_else(id_2018 == 0, 0,
                                 100*good_2018/(good_2018 + bad_2018)),
         pct_good_2019 = if_else(id_2019 == 0, 0,
                                 100*good_2019/(good_2019 + bad_2019)))
#> # A tibble: 1 × 8
#>   good_2018 bad_2018 id_2018 good_2019 bad_2019 id_2019 pct_good_2018 pct_good…¹
#>       <dbl>    <dbl>   <dbl>     <dbl>    <dbl>   <dbl>         <dbl>      <dbl>
#> 1         0        1       0         3        1       1             0         75
#> # … with abbreviated variable name ¹pct_good_2019

Instead of generating the pct_good columns for each year individually, I would like to use the purrr package, but I cannot figure out how to do it. I believe it requires rlang, but the various configurations of != and {{}} that I have tried yield errors that I do not understand.

GuedesBF · Accepted Answer

We can use glue to build the column names dynamically inside a custom function:

library(purrr)
library(glue)
# Percentage of "good" for one year, or 0 when that year's id is 0.
pct_good <- function(df, year) {
    good <- pull(df, glue('good_{year}'))
    bad  <- pull(df, glue('bad_{year}'))
    if_else(pull(df, glue('id_{year}')) == 0,
            0,
            100 * good / (good + bad))
}
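
As a quick illustration of the name-building step, the sketch below shows what glue produces for a single year and how that string selects a column. The variable names year, col_name and good_vals are only for illustration and are not part of the answer.

```r
library(dplyr)
library(glue)

df <- tibble(good_2018 = 0, bad_2018 = 1, id_2018 = 0,
             good_2019 = 3, bad_2019 = 1, id_2019 = 1)

year <- 2019

# glue() interpolates `year` into the template, giving the string good_2019.
col_name <- as.character(glue("good_{year}"))

# pull() accepts that string and returns the matching column as a plain vector.
good_vals <- pull(df, glue("good_{year}"))
```

Because the helper returns a plain vector rather than a data frame, it needs something like map_dfc or mutate around it to turn the results into columns.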

Then we can use purrr::map_dfc to create one column per year. Because the input vector is unnamed, the new columns get the placeholder names ...1 and ...2:

df %>%
    mutate(map_dfc(c(2018, 2019), ~pct_good(df, .x)))

# A tibble: 1 × 8
  good_2018 bad_2018 id_2018 good_2019 bad_2019 id_2019  ...1  ...2
      <dbl>    <dbl>   <dbl>     <dbl>    <dbl>   <dbl> <dbl> <dbl>
1         0        1       0         3        1       1     0    75
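
If the columns should carry the names pct_good_2018 and pct_good_2019 from the question, one possible variation is to let mutate build each name itself. The sketch below is not part of the answer above: add_pct_good is a hypothetical helper, and it assumes a dplyr version (1.0 or later) that supports the `.data` pronoun and glue-style names on the left of `:=`. It adds one column per call, and purrr::reduce threads the data frame through each year.

```r
library(dplyr)
library(purrr)

df <- tibble(good_2018 = 0, bad_2018 = 1, id_2018 = 0,
             good_2019 = 3, bad_2019 = 1, id_2019 = 1)

# Hypothetical helper: adds a single pct_good_<year> column to `data`.
add_pct_good <- function(data, year) {
  good <- paste0("good_", year)
  bad  <- paste0("bad_", year)
  id   <- paste0("id_", year)

  data %>%
    # "pct_good_{year}" is filled in from `year`; .data[[...]] looks up
    # each column by its string name.
    mutate("pct_good_{year}" := if_else(
      .data[[id]] == 0,
      0,
      100 * .data[[good]] / (.data[[good]] + .data[[bad]])
    ))
}

# Start from df and apply the helper once per year.
result <- reduce(c(2018, 2019), add_pct_good, .init = df)
```

With the example data, result should contain the six original columns followed by pct_good_2018 and pct_good_2019. Adding a year then only means extending the vector passed to reduce.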
