userLL

Reputation: 501

Filter by testing logical condition across multiple columns

Is there a function in dplyr that allows you to test the same condition against a selection of columns?

Take the following dataframe:

Demo1 <- c(8,9,10,11)
Demo2 <- c(13,14,15,16)
Condition <- c('A', 'A', 'B', 'B')
Var1 <- c(13,76,105,64)
Var2 <- c(12,101,23,23)
Var3 <- c(5,5,5,5)

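# cbind() coerces every column to character, so the Var columns are converted back to numeric below
df <- as.data.frame(cbind(Demo1, Demo2, Condition, Var1, Var2, Var3), stringsAsFactors = FALSE)
df[4:6] <- lapply(df[4:6], as.numeric)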

I want to keep all the rows in which at least one of Var1, Var2, or Var3 is greater than 100. I realise that I could do this with a series of or conditions, like so:

library(dplyr)

df <- df %>%
  filter(Var1 > 100 | Var2 > 100 | Var3 > 100)

However, since I have quite a few columns in my actual dataset, writing out every condition would be time-consuming. I assume there is a reasonably straightforward way to do this, but I haven't been able to find a solution on SO.

Upvotes: 8

Views: 1135

Answers (2)

MKR

Reputation: 20085

In base R, one can write the same filter using rowSums:

df[rowSums(df[, grepl("^Var", names(df))] > 100) >= 1, ]

#   Demo1 Demo2 Condition Var1 Var2 Var3
# 2     9    14         A   76  101    5
# 3    10    15         B  105   23    5
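
To see why this works, the one-liner can be unpacked into steps. The sketch below rebuilds the question's data directly with data.frame() (so Demo1 and Demo2 are numeric here rather than character), and the intermediate names var_cols, exceeds and n_hits are illustrative only:

```r
# Same data as in the question, built directly (an assumption made for brevity)
df <- data.frame(
  Demo1 = c(8, 9, 10, 11),
  Demo2 = c(13, 14, 15, 16),
  Condition = c("A", "A", "B", "B"),
  Var1 = c(13, 76, 105, 64),
  Var2 = c(12, 101, 23, 23),
  Var3 = c(5, 5, 5, 5),
  stringsAsFactors = FALSE
)

# TRUE for every column whose name starts with "Var"
var_cols <- grepl("^Var", names(df))

# Comparing a data frame with a number gives a logical matrix
exceeds <- df[, var_cols] > 100

# TRUE counts as 1, so rowSums() counts how many values per row exceed 100
n_hits <- rowSums(exceeds)

# Keep the rows with at least one hit
result <- df[n_hits >= 1, ]
```

If the Var columns can contain NA, passing na.rm = TRUE to rowSums keeps a missing value from turning the row index into NA.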

Upvotes: 3

akrun

Reputation: 886938

We can do this with filter_at and any_vars:

df %>%
  filter_at(vars(matches("^Var")), any_vars(. > 100))
#   Demo1 Demo2 Condition Var1 Var2 Var3
#1     9    14         A   76  101    5
#2    10    15         B  105   23    5
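
In dplyr 1.0.4 and later, filter_at and any_vars are superseded, and the same filter can be written with if_any inside filter. A minimal sketch, assuming dplyr >= 1.0.4 is installed and using the question's data rebuilt directly with data.frame() (the name result is illustrative):

```r
library(dplyr)

# Same data as in the question, built directly (an assumption made for brevity)
df <- data.frame(
  Demo1 = c(8, 9, 10, 11),
  Demo2 = c(13, 14, 15, 16),
  Condition = c("A", "A", "B", "B"),
  Var1 = c(13, 76, 105, 64),
  Var2 = c(12, 101, 23, 23),
  Var3 = c(5, 5, 5, 5),
  stringsAsFactors = FALSE
)

# if_any() is TRUE for a row when the condition holds in at least one selected column
result <- df %>%
  filter(if_any(starts_with("Var"), ~ .x > 100))
```

Swapping if_any for if_all would instead keep only the rows where every selected column exceeds 100.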

Or, using base R, build a logical vector with lapply and Reduce and use it to subset the rows:

df[Reduce(`|`, lapply(df[grepl("^Var", names(df))], `>`, 100)),]
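
The Reduce one-liner can be read in two stages: lapply produces one logical vector per Var column, and Reduce combines them element by element with `|`. A short sketch of those stages, with the question's data rebuilt directly with data.frame() and the illustrative names flags and keep:

```r
# Same data as in the question, built directly (an assumption made for brevity)
df <- data.frame(
  Demo1 = c(8, 9, 10, 11),
  Demo2 = c(13, 14, 15, 16),
  Condition = c("A", "A", "B", "B"),
  Var1 = c(13, 76, 105, 64),
  Var2 = c(12, 101, 23, 23),
  Var3 = c(5, 5, 5, 5),
  stringsAsFactors = FALSE
)

# One logical vector per Var column: is each value greater than 100?
flags <- lapply(df[grepl("^Var", names(df))], `>`, 100)

# OR the vectors together pairwise: (Var1 | Var2) | Var3
keep <- Reduce(`|`, flags)

# Subset the rows where at least one column met the condition
result <- df[keep, ]
```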

Upvotes: 3
