Subset row if column value for any of multiple columns equals value in a list

Question

I have a data frame with ten columns, but five columns of concern: A, B, C, D, E. I also have a list of values. What's the best way to subset the rows whose values in column A, B, C, D, OR, E is included in the list of values?

If I were only concerned with a single column, I know I can use left_join(list_of_values, df$A) but I'm not sure how to do something similar with multiple columns.

AndS. · Accepted Answer

The key here is if_any.

library(tidyverse)

set.seed(26)
sample_df <- tibble(col = rep(LETTERS[1:8], each = 5),
       val = sample(1:10, 40, replace = TRUE),
       ID = rep(1:5, 8)) |>
  pivot_wider(names_from = col, values_from = val)

sample_df
#> # A tibble: 5 x 9
#>      ID     A     B     C     D     E     F     G     H
#>           
#> 1     1     8     4    10     7     2     7     4     3
#> 2     2     3     2     3     3     4    10     2     3
#> 3     3     9     6     6     8     2    10    10     3
#> 4     4     7     6     8     9     3     5     8     3
#> 5     5     6     3     4     1     9     7     9     1

vals <- c(1, 7)

#solution
sample_df |>
  filter(if_any(A:E, ~. %in% vals))
#> # A tibble: 3 x 9
#>      ID     A     B     C     D     E     F     G     H
#>           
#> 1     1     8     4    10     7     2     7     4     3
#> 2     4     7     6     8     9     3     5     8     3
#> 3     5     6     3     4     1     9     7     9     1

or any and apply with base R:

#base solution
indx <- apply(sample_df[,which(colnames(sample_df) %in% LETTERS[1:5])], 1, \(x) any(x %in% vals))
sample_df[indx,]
#> # A tibble: 3 x 9
#>      ID     A     B     C     D     E     F     G     H
#>           
#> 1     1     8     4    10     7     2     7     4     3
#> 2     4     7     6     8     9     3     5     8     3
#> 3     5     6     3     4     1     9     7     9     1

Subset row if column value for any of multiple columns equals value in a list

Answers (1)

Related Questions