a nony mouse

Reputation: 89

Row-wise check across non-specific columns in case_when, depending on a type variable

I have a sample dataframe like this:

library(tidyverse)

id <- 1:5
type <- c("inf", "nif", "nif", "inf", "inf")
tx_1 <- c("no", "no", "no", "no", "yes")
tx_2 <- c("yes", "no", "no", "yes", "no")
tx_3 <- c("no", "yes", "no", "no", "no")
tx_4 <- c("yes", "yes", "no", "yes", "no")
tx_5 <- c("no", "no", "no", "yes", "no")
df <- data.frame(id, type, tx_1, tx_2, tx_3, tx_4, tx_5)

# Incomplete attempt; this does not run as written
# df |>
#   mutate(valid = case_when(
#     type == "inf" &
#       any(contains("tx")) == "yes" ~ 1,
#     type == "nif" &
#       rowwise
#     TRUE ~ 0
#   ))

It contains an ID and a type variable, along with an incomplete mutate.

The goal is to create a column called "valid" based on two conditions.

If type == "inf" and any of the tx columns contains "yes", the value is 1

OR

If type == "nif" and any two consecutive tx columns contain "yes", the value is 1

Otherwise the value is 0
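For reference, the two rules can be written out for a single row as a plain function. This is only a sketch of the intended logic: `is_valid` is a hypothetical helper name, and it assumes the tx values are passed in column order.

```r
# Hypothetical helper: applies the two rules to one row.
# `tx` is a character vector of that row's tx values, in column order.
is_valid <- function(type, tx) {
  yes <- tx == "yes"
  # TRUE if some "yes" is immediately followed by another "yes"
  consecutive <- any(head(yes, -1) & tail(yes, -1))
  if (type == "inf" && any(yes)) {
    1
  } else if (type == "nif" && consecutive) {
    1
  } else {
    0
  }
}

row_1 <- is_valid("inf", c("no", "yes", "no", "yes", "no"))  # one "yes" is enough
row_2 <- is_valid("nif", c("no", "no", "yes", "yes", "no"))  # tx_3 and tx_4 are adjacent
row_3 <- is_valid("nif", c("no", "no", "no", "no", "no"))    # no "yes" at all
```

Under these rules a "nif" row with "yes" values that are not adjacent would get 0.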

I would like to avoid naming the columns explicitly, since more tx columns may be added over time. I was thinking of using tidyselect helpers such as starts_with() or contains(), but those seem to work only inside select() and not inside case_when().
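As a minimal sketch of one way around this, tidyselect helpers can be used inside mutate() when they are wrapped in a function such as if_any() (available in dplyr 1.0.4 and later). The column name `valid_inf` is made up for illustration, and the example covers only the "inf" rule.

```r
library(dplyr)

df <- data.frame(
  id   = 1:5,
  type = c("inf", "nif", "nif", "inf", "inf"),
  tx_1 = c("no", "no", "no", "no", "yes"),
  tx_2 = c("yes", "no", "no", "yes", "no"),
  tx_3 = c("no", "yes", "no", "no", "no"),
  tx_4 = c("yes", "yes", "no", "yes", "no"),
  tx_5 = c("no", "no", "no", "yes", "no")
)

# if_any() takes a tidyselect helper and returns one logical per row,
# so no rowwise() is needed for the "inf" rule.
out <- df |>
  mutate(valid_inf = case_when(
    type == "inf" & if_any(starts_with("tx"), ~ .x == "yes") ~ 1,
    TRUE ~ 0
  ))
```

Any tx column added later is picked up automatically, as long as its name starts with "tx".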

Any assistance is appreciated.

Upvotes: 2

Views: 35

Answers (2)

TarJae

Reputation: 79194

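A combination of any(c_across(...)) and lag() will do the trick for the more challenging second condition. lag() shifts the row's values by one position, so comparing each value with its predecessor detects two consecutive "yes" entries: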

library(dplyr)

df %>%
  rowwise() %>%
  mutate(valid = case_when(
    # "inf": at least one tx column is "yes"
    type == "inf" & any(c_across(starts_with("tx")) == "yes") ~ 1,
    # "nif": a tx column and the one before it are both "yes"
    type == "nif" & any(c_across(starts_with("tx")) == "yes" &
                          lag(c_across(starts_with("tx"))) == "yes") ~ 1,
    TRUE ~ 0
  ))

     id type  tx_1  tx_2  tx_3  tx_4  tx_5  valid
  <int> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
1     1 inf   no    yes   no    yes   no        1
2     2 nif   no    no    yes   yes   no        1
3     3 nif   no    no    no    no    no        0
4     4 inf   no    yes   no    yes   yes       1
5     5 inf   yes   no    no    no    no        1
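To see what the lag() comparison does, here is a small sketch on the tx values of a single row, outside of any data frame. The variable names are illustrative only.

```r
library(dplyr)

x <- c("no", "no", "yes", "yes", "no")  # tx values of row 2

prev <- lag(x)                       # each value's predecessor; first is NA
pair <- x == "yes" & prev == "yes"   # TRUE where a "yes" follows a "yes"
has_pair <- any(pair)

# When the first value is "yes", the first comparison is NA rather than FALSE
first_yes  <- c("yes", "no", "no", "no", "no")
pair_first <- first_yes == "yes" & lag(first_yes) == "yes"
```

Here `pair` is TRUE only at the fourth position, so `has_pair` is TRUE. For `first_yes`, any(pair_first) is NA because there is no TRUE and one NA; any(pair_first, na.rm = TRUE) gives an explicit FALSE if that is preferred.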

Upvotes: 1

tmfmnk

Reputation: 40171

One option could be:

df %>%
  rowwise() %>%
  mutate(res = case_when(
    # "inf": at least one tx column is "yes"
    type == "inf" & rowSums(across(starts_with("tx")) == "yes") >= 1 ~ 1,
    # "nif": some run of consecutive "yes" values has length 2 or more
    type == "nif" & with(rle(c_across(starts_with("tx"))), any(values == "yes" & lengths >= 2)) ~ 1,
    TRUE ~ 0
  ))

     id type  tx_1  tx_2  tx_3  tx_4  tx_5    res
  <int> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
1     1 inf   no    yes   no    yes   no        1
2     2 nif   no    no    yes   yes   no        1
3     3 nif   no    no    no    no    no        0
4     4 inf   no    yes   no    yes   yes       1
5     5 inf   yes   no    no    no    no        1
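The rle() step may be easier to follow on a single vector. A minimal sketch in base R, using the tx values of row 2 (variable names are illustrative):

```r
x <- c("no", "no", "yes", "yes", "no")  # tx values of row 2

# rle() collapses the vector into runs of identical consecutive values
runs <- rle(x)
# runs$values  holds the value of each run
# runs$lengths holds how long each run is

# A run of "yes" with length 2 or more means two consecutive "yes" columns
has_run <- any(runs$values == "yes" & runs$lengths >= 2)
```

For this row the runs are "no" (length 2), "yes" (length 2) and "no" (length 1), so `has_run` is TRUE. Changing the threshold from 2 to 3 would require three consecutive "yes" values instead.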

Upvotes: 2
