littleworth

Reputation: 5169

How to check if a vector is contained in a list column of a data frame with dplyr

I have the following data frame:

library(tidyverse)

dat <- tribble(~cell, ~status, 
        "A", "x+", 
        "A", "y-",
        "A", "z+", 
        "B", "x-",
        "B", "y-", 
        "B", "z+")

Then I group the data frame by cell and construct a list column. Given a vector of statuses, I want to know, for each of the two cells, whether the cell contains that vector.

wanted_status <- c("x+", "y-")
dat %>% 
  group_by(cell) %>% 
  mutate(nstatus = list(status)) %>% # construct list column
  dplyr::select(-status) %>% 
  unique() %>% 
  mutate(contained = if_else(wanted_status %in% nstatus, "in", "out")) # check if wanted_status vector is contained in nstatus or not.

With that example, I expect the result to be:

  cell  contained  
  A     in
  B     out

How can I achieve that?

My current code gives this error:

Error: Problem with `mutate()` input `contained`.
x Input `contained` can't be recycled to size 1.
ℹ Input `contained` is `if_else(wanted_status %in% nstatus, "in", "out")`.
ℹ Input `contained` must be size 1, not 2.
ℹ The error occurred in group 1: cell = "A".

Upvotes: 0

Views: 766

Answers (2)

akrun

Reputation: 886938

We could do this without an if/else condition, by indexing a two-element vector with the logical result:

library(dplyr)
dat %>%
   group_by(cell) %>%
   summarise(contained = c('out', 'in')[1 + all(wanted_status %in% status)],
       .groups = 'drop')
# A tibble: 2 x 2
  cell  contained
  <chr> <chr>    
1 A     in       
2 B     out      
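
To see why the indexing works, here is a small base R sketch of the same idea outside of dplyr. The helper name label_status and the two example vectors are made up for illustration; they just mirror the statuses of cells A and B from the question. `%in%` returns one logical per element of wanted_status, and all() collapses that to the single value each group needs.

```r
wanted_status <- c("x+", "y-")

# Statuses of each cell, copied from the question's data
status_a <- c("x+", "y-", "z+")
status_b <- c("x-", "y-", "z+")

# %in% gives one TRUE/FALSE per wanted element (length 2 here)
in_a <- wanted_status %in% status_a   # TRUE TRUE
in_b <- wanted_status %in% status_b   # FALSE TRUE

# Hypothetical helper: all() reduces to a single logical,
# 1 + FALSE is 1 ("out") and 1 + TRUE is 2 ("in")
label_status <- function(status, wanted) {
  c("out", "in")[1 + all(wanted %in% status)]
}

label_status(status_a, wanted_status)  # "in"
label_status(status_b, wanted_status)  # "out"
```

Because the result is always length 1 per group, it fits directly inside summarise().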

Upvotes: 1

Ronak Shah

Reputation: 388807

Do you want to check for any value in wanted_status or all of them? The expected output suggests all.

library(dplyr)

wanted_status <- c("x+", "y-")

dat %>%
  group_by(cell) %>%
  summarise(contained = if(all(wanted_status %in% status)) 'in' else 'out')

#  cell  contained
#  <chr> <chr>    
#1 A     in       
#2 B     out      
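
If it helps to compare the two readings, a minimal sketch along these lines puts all() and any() side by side. The column names contains_all and contains_any are only illustrative, and the data is rebuilt with a base data.frame so the block runs on its own (assuming dplyr 1.0.0 or later for the .groups argument).

```r
library(dplyr)

# Same data as in the question, built with base R
dat <- data.frame(
  cell   = c("A", "A", "A", "B", "B", "B"),
  status = c("x+", "y-", "z+", "x-", "y-", "z+")
)

wanted_status <- c("x+", "y-")

result <- dat %>%
  group_by(cell) %>%
  summarise(
    contains_all = all(wanted_status %in% status),  # every wanted status present
    contains_any = any(wanted_status %in% status),  # at least one present
    .groups = "drop"
  )

result
```

With this data, cell B has "y-" but not "x+", so the two columns differ only for B; that difference is why the expected output in the question points to all().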

Upvotes: 3
