Reputation: 15
I have this df with 5 columns: Name, screen_date, enroll_date, screen2enroll_days, and enrollment_type
Name | screen_date | enroll_date | screen2enroll_days | enrollment_type |
---|---|---|---|---|
John | 2020-08-20 | 2020-08-01 | 14 | TypeX |
Mike | 2020-08-20 | 2020-08-01 | 14 | TypeY |
Sam | 2020-10-20 | 2020-08-05 | 65 | TypeY |
Dan | 2020-11-05 | 2020-08-05 | 90 | TypeX |
df <-
  data.frame(
    "Name" = c("John", "Mike", "Sam", "Dan"),
    "screen_date" = c("2020-08-01", "2020-08-20", "2020-10-20", "2020-11-05"),
    "enroll_date" = c("2020-08-01", "2020-08-01", "2020-08-05", "2020-08-05"),
    "screen2enroll_days" = c(14, 14, 65, 90),
    "enrollment_type" = c("TypeX", "TypeY", "TypeY", "TypeX")
  )
I want to create a function that reads in one or more of my columns and creates a new column called Action, which uses the column screen2enroll_days to identify whether a client needs a screening test. However, I ran into errors. This is the output I expect, followed by my attempt:
Name | screen_date | enroll_date | screen2enroll_days | Action (new_col) |
---|---|---|---|---|
John | 2020-08-14 | 2020-08-01 | 14 | Up-to-date |
Sam | 2020-10-20 | 2020-08-05 | 65 | Requires Screening |
Dan | 2020-11-05 | 2020-08-05 | 90 | No Screening Required |
Mike | 2020-08-20 | 2020-08-01 | 14 | No Screening Required |
mutate_function <- function(df, new_col, my_col, my_col2, value1, value2, value3) {
  df %>% mutate(new_col = case_when(
    my_col <= 14 ~ "value1",
    my_col <= 14 & my_col2 != "TypeX" ~ "value2",
    (my_col > 14 & my_col <= 65) ~ "value3",
    TRUE ~ "value2")
  )}
mutate_function(df, Action, mycol = screen2enroll_days, my_col2 = enrollment_type, "Up-to-date", "Requires Screening", "No Screening Required")
Upvotes: 0
Views: 41
Reputation: 30549
I think there are a number of things to address to make this functional:
- Inside a function, you can dynamically access column names from your arguments with double curly braces (`{{...}}`), which expect bare column names. Alternatively, you can use the bang-bang operator with sym (`!!sym()`), or try `.data[[variable]]` to reference the variable from the pipe; both of these expect the column name as a string. Otherwise, it would seem you are trying to reference columns literally called `my_col` or `my_col2` (for example) in `df`, which don't exist.
- If you want to set the new column values based on the `value1`, `value2`, or `value3` arguments, you will want to leave off the quotes in your `case_when` statement.
- To dynamically set the name of `new_col`, use the assignment operator `:=`.
- When calling the function, you may want to double-check your argument names (such as `mycol` vs. `my_col`; note the underscore).
- Finally, you may want to double-check your `case_when` logic. I believe the second line will never be reached, because every row where `my_col` is <= 14 has already been assigned `value1` by the first line.
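The last point, about branch order in `case_when`, can be checked with a small sketch. The vectors `days` and `type` below are hypothetical toy values, not the question's data, and the reordered version is only an illustration of how the order changes the result, not a claim about which rule the asker intends:

```r
library(dplyr)

# Hypothetical toy values, chosen only to illustrate branch order
days <- c(10, 14, 30)
type <- c("TypeX", "TypeY", "TypeY")

# Broad condition first: the second branch can never match,
# because case_when() uses the first condition that is TRUE
broad_first <- case_when(
  days <= 14 ~ "value1",
  days <= 14 & type != "TypeX" ~ "value2",
  TRUE ~ "other"
)

# Narrow condition first: the second element now matches it
narrow_first <- case_when(
  days <= 14 & type != "TypeX" ~ "value2",
  days <= 14 ~ "value1",
  TRUE ~ "other"
)

print(broad_first)
print(narrow_first)
```

If the more specific rule is meant to apply, it probably needs to come before the more general one.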
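library(dplyr)

mutate_function <- function(df, new_col, my_col, my_col2, value1, value2, value3) {
  df %>% mutate({{new_col}} := case_when(
    {{my_col}} <= 14 ~ value1,
    # Never reached: every row with my_col <= 14 already matched the line above
    {{my_col}} <= 14 & {{my_col2}} != "TypeX" ~ value2,
    ({{my_col}} > 14 & {{my_col}} <= 65) ~ value3,
    TRUE ~ value2)
  )
}

# Pass bare (unquoted) column names so that {{ }} resolves them as columns of df;
# a quoted name such as "screen2enroll_days" would be compared as a literal string
mutate_function(df,
                new_col = Action,
                my_col = screen2enroll_days,
                my_col2 = enrollment_type,
                "Up-to-date",
                "Requires Screening",
                "No Screening Required")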
Upvotes: 2
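If you would rather pass the column names as strings, the `.data[[variable]]` approach mentioned in the answer might look like the sketch below. The name `mutate_function_chr` is hypothetical, the `case_when` branches are copied unchanged (including the unreachable second branch), and `!!new_col :=` is one way of naming the new column from a string; treat it as an illustration under those assumptions rather than the answerer's own code.

```r
library(dplyr)

# Data taken from the question
df <- data.frame(
  Name = c("John", "Mike", "Sam", "Dan"),
  screen_date = c("2020-08-01", "2020-08-20", "2020-10-20", "2020-11-05"),
  enroll_date = c("2020-08-01", "2020-08-01", "2020-08-05", "2020-08-05"),
  screen2enroll_days = c(14, 14, 65, 90),
  enrollment_type = c("TypeX", "TypeY", "TypeY", "TypeX")
)

# Hypothetical variant: new_col, my_col and my_col2 are character strings
mutate_function_chr <- function(df, new_col, my_col, my_col2, value1, value2, value3) {
  df %>% mutate(
    # !!new_col := names the new column from the string in new_col
    !!new_col := case_when(
      .data[[my_col]] <= 14 ~ value1,
      # Unreachable, as in the original: rows with my_col <= 14 match above
      .data[[my_col]] <= 14 & .data[[my_col2]] != "TypeX" ~ value2,
      .data[[my_col]] > 14 & .data[[my_col]] <= 65 ~ value3,
      TRUE ~ value2
    )
  )
}

out <- mutate_function_chr(
  df,
  new_col = "Action",
  my_col = "screen2enroll_days",
  my_col2 = "enrollment_type",
  "Up-to-date",
  "Requires Screening",
  "No Screening Required"
)
print(out)
```

With the question's data, both rows with 14 days fall into the first branch, the 65-day row into the third, and the 90-day row into the fallback.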