Wang

Reputation: 1460

R dplyr filter rows based on conditions from several selected columns

I have a data frame DF, and I want to filter it based on a condition applied to several selected columns.

For instance, I want to keep the rows of DF in which at least one of the columns PCS_AB, PCS_AD, PCS_BD has a value smaller than 0.03.

DF <- data.frame(A = c(100, 10, 13),
                 B = c(33, 44, 12),
                 D = c(110, 21, 22),
                 PCS_AB = c(0.03, 0.001, 0.3),
                 PCS_AD = c(0.01, 0.2, 0.33),
                 PCS_BD = c(0.99, 1.0, 0.45))

I can achieve this with the following code:

DF_filter <- DF %>%
  filter(PCS_AB < 0.03 | PCS_AD < 0.03 | PCS_BD < 0.03)

But I would like something simpler, like the following pseudocode:

DF2 <- DF %>%
      filter(any(starts_with("PCS")) < 0.03)

Is it possible with dplyr? Thanks.

Upvotes: 3

Views: 1107

Answers (1)

akrun

Reputation: 887951

We can use if_any, which is available from dplyr version 1.0.4 onward

library(dplyr)
DF %>%
   filter(if_any(starts_with("PCS"), ~ . < 0.03))

-output

#    A  B   D PCS_AB PCS_AD PCS_BD
#1 100 33 110  0.030   0.01   0.99
#2  10 44  21  0.001   0.20   1.00
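
As a quick sanity check, the if_any call can be compared against the hand-written condition from the question. This is only a sketch: it rebuilds the example data, uses the strict < from the question, and the names manual and via_if_any are introduced here for illustration.

```r
library(dplyr)

# Example data from the question
DF <- data.frame(A = c(100, 10, 13),
                 B = c(33, 44, 12),
                 D = c(110, 21, 22),
                 PCS_AB = c(0.03, 0.001, 0.3),
                 PCS_AD = c(0.01, 0.2, 0.33),
                 PCS_BD = c(0.99, 1.0, 0.45))

# Explicit condition, one comparison per column
manual <- DF %>%
  filter(PCS_AB < 0.03 | PCS_AD < 0.03 | PCS_BD < 0.03)

# Same condition through if_any (needs dplyr >= 1.0.4)
via_if_any <- DF %>%
  filter(if_any(starts_with("PCS"), ~ .x < 0.03))

# Both approaches should keep the same rows
all.equal(manual, via_if_any)
```

If the goal were instead to keep rows where every PCS column meets the condition, if_all takes the same arguments.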

It is also possible to use filter_at with any_vars (superseded since dplyr 1.0.0, so if_any is preferable in new code)

DF %>%
  filter_at(vars(starts_with("PCS")), any_vars(. < 0.03))

-output

#    A  B   D PCS_AB PCS_AD PCS_BD
#1 100 33 110  0.030   0.01   0.99
#2  10 44  21  0.001   0.20   1.00

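Or use rowSums to create the logical vector: count, per row, how many PCS columns are below 0.03 and keep the rows where that count is greater than 0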

DF %>%
    filter(rowSums(select(., starts_with('PCS')) < 0.03) > 0)
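
The rowSums idea does not depend on dplyr, so it can also be written in base R. A minimal sketch, assuming the example data from the question; pcs_cols and hits are names made up here for illustration:

```r
# Example data from the question
DF <- data.frame(A = c(100, 10, 13),
                 B = c(33, 44, 12),
                 D = c(110, 21, 22),
                 PCS_AB = c(0.03, 0.001, 0.3),
                 PCS_AD = c(0.01, 0.2, 0.33),
                 PCS_BD = c(0.99, 1.0, 0.45))

# Logical vector marking the columns whose name starts with "PCS"
pcs_cols <- startsWith(names(DF), "PCS")

# Comparing the selected columns gives a logical matrix;
# rowSums counts the TRUE values in each row
hits <- rowSums(DF[pcs_cols] < 0.03) > 0

# Keep the rows with at least one value below 0.03
DF_filter <- DF[hits, ]
DF_filter
```

If the PCS columns may contain NA, passing na.rm = TRUE to rowSums treats missing values as non-matches instead of producing NA in the logical vector.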

Upvotes: 4
