titeuf

Reputation: 163

How to make sure that all options are used in case_when()?

I would like to recode a variable based on the answers in the data frame df. The new variable is 1 if the task takes less than 30 minutes and 0 if it takes longer. My data may contain some random errors. If a value in df is not handled by any condition in my case_when() call, I would like to raise an error such as: "the observation 'random error' was not used in your case_when() function". Any ideas how to solve this? I was thinking about calling unique(df$col1) and then somehow comparing the result with the values used in the case_when() function, but I have no clue how to do this inside dplyr::transmute.

library(dplyr)

df <- data.frame(col1 = c("<30min", "31-60min", "61-120min", ">120min", NA, "random error"))


df2 <- df %>% 
  transmute(xxx = case_when(
      col1 == "<30min" ~ 1,
      col1 == "31-60min" | col1 == "61-120min" | col1 == ">120min" ~ 0,
      TRUE ~ NA_real_
    )
  )

df2

Upvotes: 0

Views: 48

Answers (1)

akrun

Reputation: 887213

Instead of chaining multiple == comparisons, we can use %in%:

library(dplyr)
df %>%
    transmute(xxx = case_when(col1 == "<30min" ~ 1,
                 col1 %in% c("31-60min", "61-120min", ">120min") ~ 0, 
                 TRUE ~ NA_real_))
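
The %in% form also makes it easier to check for values that no branch covers, which is what the question asks for. Below is a minimal sketch of one possible approach, not the only one: the vectors short_vals and long_vals and the variable unused are hypothetical names introduced here for illustration. The idea is to keep the expected values in vectors, compare them against the non-missing values in col1 with setdiff(), and warn before recoding.

```r
library(dplyr)

df <- data.frame(col1 = c("<30min", "31-60min", "61-120min", ">120min", NA, "random error"))

# Hypothetical helper vectors: the values each case_when() branch is meant to handle
short_vals <- "<30min"
long_vals  <- c("31-60min", "61-120min", ">120min")

# Non-missing values present in the data that no branch covers
unused <- setdiff(unique(df$col1[!is.na(df$col1)]), c(short_vals, long_vals))

if (length(unused) > 0) {
  warning("not used in case_when(): ", paste(unused, collapse = ", "))
}

# Recode using the same vectors, so the check and the recoding cannot drift apart
df2 <- df %>%
  transmute(xxx = case_when(
    col1 %in% short_vals ~ 1,
    col1 %in% long_vals ~ 0,
    TRUE ~ NA_real_
  ))
```

Replacing warning() with stop() would turn the message into a hard error, if the recoding should not run at all when unexpected values are present. NA is deliberately excluded from the check here, on the assumption that missing answers are expected and should simply stay NA.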

Upvotes: 1
