Filter conditions based on a list of columns in data.table

Question

Problem: Suppose I have the following data.table object. I want to apply the following filter condition:

(CMT_1 != "") | (CMT_2 != "") | (CMT_3 != "")

There can be more CMT_* columns, and I do not know them a priori, so I want the condition to adapt to however many CMT_* columns are present. Any suggestions on how to write this nicely?

library(data.table)

dt <- data.table(
  CMT_1 = 1:3,
  CMT_2 = 4:6,
  CMT_3 = 8:10,
  remainder1 = 12:14,
  remainder2 = 15:17
)

cmts <- names(dt)[startsWith(names(dt), "CMT_")]

## filter condition which I want to make flexible
dt[(CMT_1 != "") | (CMT_2 != "") | (CMT_3 != "")]

chinsoon12 · Accepted Answer

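Here is one option using rowSums and .I: count the non-empty CMT_* values in each row, keep the indices of the rows where that count is at least one (which is equivalent to chaining the conditions with |), and subset by those indices:

cmts <- grep("^CMT_", names(dt), value=TRUE)
dt[dt[, .I[rowSums(.SD!="") > 0L], .SDcols=cmts]]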
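
To see the filter actually drop rows, here is a minimal sketch on made-up data. The character values below are hypothetical (the question's sample columns are integers, which never equal ""), and the Reduce variant is just another way to write the same OR condition, not part of the original answer. Neither version handles NA specially.

```r
library(data.table)

# Hypothetical sample data: "" marks an empty comment
dt <- data.table(
  CMT_1 = c("a", "", ""),
  CMT_2 = c("", "", "b"),
  CMT_3 = c("", "", "c"),
  remainder1 = 12:14
)

cmts <- grep("^CMT_", names(dt), value = TRUE)

# rowSums approach: keep rows with at least one non-empty CMT_* value
res <- dt[dt[, .I[rowSums(.SD != "") > 0L], .SDcols = cmts]]

# Alternative: OR the per-column conditions together with Reduce
keep <- dt[, Reduce("|", lapply(.SD, function(x) x != "")), .SDcols = cmts]
res2 <- dt[keep]

print(res)
```

Both versions keep the first and third rows and drop the second, where every CMT_* column is empty.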
