Compact, computationally and memory-efficient way to do iterative or recursive subsets of a data.table

Question

I would like to subset a data.table dat based on a condition applied to one column of it, resulting in a subset foobar, and then use the values in foobar as the subset condition on all columns of dat; rinse and repeat until the newly found subset is empty or a maximum number of iterations has been reached.

The following MWE, with input and output embedded as comments, is my current solution. Is there a faster, more memory-efficient, and more compact data.table-idiomatic way to do this?

library(data.table)

dat <- data.table(c("a", "a", "", "b", "b", "c", "d"),
                  c(1, 1, 1, 1, 2, 3, 4),
                  c(11, 11, 11, 11, 12, 12, 14),
                  i = 1:7)
# print(dat)
# input
#    V1 V2 V3 i
# 1:  a  1 11 1
# 2:  a  1 11 2
# 3:     1 11 3
# 4:  b  1 11 4
# 5:  b  2 12 5
# 6:  c  3 12 6
# 7:  d  4 14 7

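# Seed subset: rows matching the initial condition.
foobar <- dat[V1 == "a" | V2 == 1 | V3 == -1]
# The loop counter is named `iter` because `i` inside dat[...] refers to column i.
for (iter in 1:5) {
  # Rows not yet selected that share a value with foobar in any column.
  baz <- dat[!(i %in% foobar$i) & (V1 %chin% foobar$V1 | V2 %in% foobar$V2 | V3 %in% foobar$V3)]
  if (nrow(baz) == 0) break
  foobar <- rbindlist(list(foobar, baz))
}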
# print(foobar)
# output
#    V1 V2 V3 i
# 1:  a  1 11 1
# 2:  a  1 11 2
# 3:     1 11 3
# 4:  b  1 11 4
# 5:  b  2 12 5
# 6:  c  3 12 6
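
For comparison, here is a sketch of one possible variation on the loop above; it is not a benchmarked or definitive solution. It keeps the selection as a logical vector instead of growing foobar with rbindlist, and in each round it matches only against the rows added in the previous round (the "frontier"), since values from older rows have already been matched. The helper name expand_subset and its arguments are hypothetical, and the columns are named explicitly here for clarity.

```r
library(data.table)

dat <- data.table(V1 = c("a", "a", "", "b", "b", "c", "d"),
                  V2 = c(1, 1, 1, 1, 2, 3, 4),
                  V3 = c(11, 11, 11, 11, 12, 12, 14),
                  i = 1:7)

# Hypothetical helper: grow a row selection until no new rows match
# or max_iter rounds have run.
expand_subset <- function(dt, seed, cols, max_iter = 5L) {
  selected <- seed   # TRUE for rows already in the subset
  frontier <- seed   # rows added in the previous round
  for (iter in seq_len(max_iter)) {
    hit <- rep(FALSE, nrow(dt))
    for (col in cols) {
      # Rows sharing a value in this column with any frontier row.
      hit <- hit | dt[[col]] %in% unique(dt[[col]][frontier])
    }
    frontier <- hit & !selected   # keep only rows not selected before
    if (!any(frontier)) break
    selected <- selected | frontier
  }
  dt[selected]
}

# Same seed condition as in the MWE, evaluated to a logical vector.
seed <- dat[, V1 == "a" | V2 == 1 | V3 == -1]
result <- expand_subset(dat, seed, c("V1", "V2", "V3"))
print(result)
```

Note that this returns the rows in their original order in dat, whereas the rbindlist version appends rows in order of discovery; for the example data the two coincide. Whether it is actually faster or lighter on memory would need to be measured on realistic data.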
