Reputation: 23
I'm trying to use the tidyverse to make a function that allows me to take a vector of column names (of class factor), count how many observations meet a certain condition (value == "yes"), mutate()
and make new columns using this sum so I can summarise the data later.
I have written a function that can do this for a single column, but I want to be able to pass a vector of column names of any length using all_of()
grammar to the function. When I try this, I get a new column with the same name as the vector (including my _count
suffix) not the values inside the vector.
This is similar to TimTeaFan's Post but I would like to pass a vector where length > 1.
I think I might need to use the (. . .) option in dplyr and some more rlang, but I haven't quite figured out the right combination. The function doesn't work when I substitute (. . .) for objective thanks to NSE. I've also tried variations of rlang::as_name()
and rlang::get_env()
. There was an outdated workbook that used purrr::map()
as well, but I haven't had any luck implementing it here.
I either got Error: Column x
is not found.
or Error: Promise has already been forced
Here is a reproducible example with data
dat <- tibble(category = rep(letters[1:10], 2),
intake = factor(c(rep("no", 12), rep("yes", 8))),
outtake = factor(c(rep("yes", 11), rep("no", 9))),
pretake = factor(c(rep(c("no", "yes"), 10))))
yessum <- function(.data, objective) {
.data %>%
dplyr::mutate("{{objective}}_count" := sum(
ifelse(
unlist(!!rlang::ensym(objective)) == "yes", 1, 0)))
}
dat %>%
group_by(category) %>%
yessum(intake)
I want to be able to pass a vector of certain column names to yessum and receive a set of new columns just like intake_new
but named outtake_new
and pretake_new
.
Here's what currently happens when I try that:
vars <- c("intake", "outtake", "pretake")
dat %>%
group_by(category) %>%
yessum(vars)
Any and all help is welcome!
Upvotes: 2
Views: 188
Reputation: 16836
You don't necessarily need the function, as you can just mutate
across
the columns and get sums for each category.
library(tidyverse)
dat %>%
group_by(category) %>%
mutate(across(ends_with("take"), .fns = list(count = ~sum(. == "yes"))))
Or if you have a long list, then you can use vars
directly in the across
statement:
vars <- c("intake", "outtake", "pretake")
dat %>%
group_by(category) %>%
mutate(across(vars, .fns = list(count = ~sum(. == "yes"))))
Output
category intake outtake pretake intake_count outtake_count pretake_count
<chr> <fct> <fct> <fct> <int> <int> <int>
1 a no yes no 0 2 0
2 b no yes yes 0 1 2
3 c no yes no 1 1 0
4 d no yes yes 1 1 2
5 e no yes no 1 1 0
6 f no yes yes 1 1 2
7 g no yes no 1 1 0
8 h no yes yes 1 1 2
9 i no yes no 1 1 0
10 j no yes yes 1 1 2
11 a no yes no 0 2 0
12 b no no yes 0 1 2
13 c yes no no 1 1 0
14 d yes no yes 1 1 2
15 e yes no no 1 1 0
16 f yes no yes 1 1 2
17 g yes no no 1 1 0
18 h yes no yes 1 1 2
19 i yes no no 1 1 0
20 j yes no yes 1 1 2
Upvotes: 1