R. Cowboy
R. Cowboy

Reputation: 223

How to select rows comparing multiple columns with the same value?

Suppose I have the following data and a vector of column names

dt <- data.table(id = letters[1:10], amount = 1:10, id2 = c(rep('a',5),rep('b',5)),test=rep('a',10))
cols <- c('id','id2','test')

what I'm trying to do is select rows where all columns in the vector have the same specific value like

dt[id=='a' & id2=='a' & test == 'a']

but using the vector cols. Is there a way to do it?

Note: I need to find a way to do it using data.table or base R without making comparisons between the columns like

dt[id==id2 & id==test & id2==test]

Upvotes: 3

Views: 871

Answers (5)

chinsoon12
chinsoon12

Reputation: 25225

Another option is using a join:

dt[as.list(rep('a', length(cols))), on=cols]

Upvotes: 1

r2evans
r2evans

Reputation: 160447

Here's a data.table-friendly solution that is not looking for the literal "a":

This helper-function simply takes the table of its arguments, and returns TRUE if the number of distinct elements is other than "1". (I test for 2L because I use useNA="always", which will always append an NA count even if 0.)

allsame <- function(x) do.call(mapply, c(list(FUN = function(...) length(table(unlist(list(...)), useNA = "always")) == 2L), x))

From here, you can use it externally or internally:

dt[allsame(dt[,..cols]),]
#        id amount    id2   test
#    <char>  <int> <char> <char>
# 1:      a      1      a      a

It's in fact not data.table-specific, it'll work with anything as long as its argument is a data.frame-like object with all of the comments you need compared.

mtcars[1:3,10:11]
#               gear carb
# Mazda RX4        4    4
# Mazda RX4 Wag    4    4
# Datsun 710       4    1

allsame(mtcars[1:3,10:11])
# [1]  TRUE  TRUE FALSE

Upvotes: 1

akrun
akrun

Reputation: 887223

Using if_all

library(dplyr)
dt %>%
    filter(if_all(all_of(cols), ~ . == 'a'))
   id amount id2 test
1:  a      1   a    a

Upvotes: 0

tmfmnk
tmfmnk

Reputation: 39858

With dplyr and purrr, you could do:

dt %>%
    filter(reduce(.x = across(all_of(cols), ~ . == "a"), .f = `&`))

   id amount id2 test
1:  a      1   a    a

Or the same idea using data.table:

dt[dt[, Reduce(`&`, lapply(.SD, function(x) x == "a")), .SDcols = cols]]

Upvotes: 2

Ronak Shah
Ronak Shah

Reputation: 389047

You can take help of .SDcols -

library(data.table)

dt[dt[, rowSums(.SD == 'a') == length(cols), .SDcols = cols]]

#   id amount id2 test
#1:  a      1   a    a

This can also be written as -

dt[rowSums(dt[, ..cols] == 'a') == length(cols), ]

Upvotes: 3

Related Questions