Tseng

Reputation: 15

Need help turning this function into a custom function

I have this df with 5 columns: Name, screen_date, enroll_date, screen2enroll_days, and enrollment_type

Name  screen_date  enroll_date  screen2enroll_days  enrollment_type
John  2020-08-20   2020-08-01   14                  TypeX
Mike  2020-08-20   2020-08-01   14                  TypeY
Sam   2020-10-20   2020-08-05   65                  TypeY
Dan   2020-11-05   2020-08-05   90                  TypeX
df <-
  data.frame(
    "Name" = c("John", "Mike", "Sam", "Dan"),
    "screen_date" = c("2020-08-01", "2020-08-20", "2020-10-20", "2020-11-05"),
    "enroll_date" = c("2020-08-01", "2020-08-01", "2020-08-05", "2020-08-05"),
    "screen2enroll_days" = c(14, 14, 65, 90),
    "enrollment_type" = c("TypeX", "TypeY", "TypeY", "TypeX")
  )

I want to write a function that takes one or more of my columns and creates a new column called Action, which uses the column screen2enroll_days to identify whether a client needs a screening test. However, I ran into errors. This is the output I am after, followed by my attempt:

Name  screen_date  enroll_date  screen2enroll_days  Action (new_col)
John  2020-08-14   2020-08-01   14                  Up-to-date
Sam   2020-10-20   2020-08-05   65                  Requires Screening
Dan   2020-11-05   2020-08-05   90                  No Screening Required
Mike  2020-08-20   2020-08-01   14                  No Screening Required
mutate_function <- function(df, new_col, my_col, my_col2, value1, value2, value3) {
    
    df %>% mutate(new_col = case_when(
             my_col <= 14 ~ "value1",
             my_col <= 14 & my_col2 != "TypeX" ~ "value2",
             (my_col > 14 & my_col <= 65) ~ "value3",
             TRUE ~ "value2")
    )}

mutate_function(df, Action, mycol = screen2enroll_days, my_col2 = enrollment_type, "Up-to-date", "Requires Screening", "No Screening Required")

Upvotes: 0

Views: 41

Answers (1)

Ben

Reputation: 30549

I think there are a number of things to address to make this functional:

  • Inside a function, you can refer to columns passed as arguments with double curly braces ({{...}}), which expect bare (unquoted) column names. Alternatively, if you pass column names as strings, you can use the bang-bang operator with sym: !!sym(), or .data[[variable]] to reference the variable from the pipe. Otherwise, dplyr looks in df for columns literally named my_col or my_col2 (for example), which don't exist.

  • If you want to set the new column values based on the value1, value2, or value3 arguments, leave off the quotes in your case_when statement. With quotes, "value1" is just the literal string, not the argument.

  • To set the name of new_col dynamically, use the := operator instead of = inside mutate.

  • When calling the function, you may want to double check your argument names (such as mycol vs. my_col - note underscore)

  • Finally, you may want to double check your case_when logic. I believe the second line might never get called, as all circumstances when my_col is <= 14 will be considered as value1
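
To illustrate the last point in isolation, here is a minimal sketch of how case_when picks the first matching condition. The vectors days and type are hypothetical stand-ins for screen2enroll_days and enrollment_type, and the reordered version is only an assumption about what you might intend, not necessarily the logic you need:

```r
library(dplyr)

# Hypothetical stand-ins for screen2enroll_days and enrollment_type
days <- c(14, 14, 65, 90)
type <- c("TypeX", "TypeY", "TypeY", "TypeX")

# Original order: the first condition catches every value <= 14,
# so the second condition never gets a chance to match.
original_order <- case_when(
  days <= 14 ~ "value1",
  days <= 14 & type != "TypeX" ~ "value2",
  days > 14 & days <= 65 ~ "value3",
  TRUE ~ "value2"
)

# More specific condition first, so it can match before the general one.
specific_first <- case_when(
  days <= 14 & type != "TypeX" ~ "value2",
  days <= 14 ~ "value1",
  days > 14 & days <= 65 ~ "value3",
  TRUE ~ "value2"
)

print(original_order)
print(specific_first)
```

The two results differ only for the second element (14 days, TypeY), which is the one case the unreachable condition was meant to handle.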


library(dplyr)

mutate_function <- function(df, new_col, my_col, my_col2, value1, value2, value3) {
  df %>% mutate({{new_col}} := case_when(
    {{my_col}} <= 14 ~ value1,
    {{my_col}} <= 14 & {{my_col2}} != "TypeX" ~ value2,
    ({{my_col}} > 14 & {{my_col}} <= 65) ~ value3,
    TRUE ~ value2)
  )
}

# {{ }} expects bare (unquoted) column names, so pass them without quotes
mutate_function(df, 
                new_col = Action, 
                my_col = screen2enroll_days, 
                my_col2 = enrollment_type, 
                "Up-to-date", 
                "Requires Screening", 
                "No Screening Required")

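If you would rather pass the column names as strings, the other two approaches mentioned above can be combined. This is only a sketch: mutate_function_chr is a hypothetical name, the data frame is a cut-down copy of yours, and the case_when conditions are copied unchanged from your attempt (including the second condition that is never reached):

```r
library(dplyr)
library(rlang)  # provides sym()

# Cut-down copy of the example data (only the columns the function uses)
df <- data.frame(
  Name = c("John", "Mike", "Sam", "Dan"),
  screen2enroll_days = c(14, 14, 65, 90),
  enrollment_type = c("TypeX", "TypeY", "TypeY", "TypeX")
)

# Hypothetical variant: new_col, my_col and my_col2 are character strings
mutate_function_chr <- function(df, new_col, my_col, my_col2,
                                value1, value2, value3) {
  df %>% mutate(
    # !!sym() turns the string into a column name on the left of :=
    !!sym(new_col) := case_when(
      # .data[[...]] looks the column up by its string name
      .data[[my_col]] <= 14 ~ value1,
      .data[[my_col]] <= 14 & .data[[my_col2]] != "TypeX" ~ value2,
      .data[[my_col]] > 14 & .data[[my_col]] <= 65 ~ value3,
      TRUE ~ value2
    )
  )
}

result <- mutate_function_chr(df,
                              new_col = "Action",
                              my_col = "screen2enroll_days",
                              my_col2 = "enrollment_type",
                              "Up-to-date",
                              "Requires Screening",
                              "No Screening Required")
print(result)
```

The string version is handy when the column names come from a vector or a config file, while {{ }} reads more naturally for interactive use.
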
Upvotes: 2
