John Smith

Reputation: 1117

Compact, computationally and memory-efficient way to do iterative or recursive subsets of a data.table

I would like to subset a data.table dat by a condition applied to one of its columns, giving a subset foobar, and then use the values in foobar as the subsetting condition on all columns of dat; rinse and repeat until the new subset is empty or a maximum number of iterations has been reached.

The following MWE, with input and output embedded as comments, is my current solution. Is there a faster, more memory-efficient, and more compact data.table-ish way to do this?

library(data.table)

dat <- data.table(c("a", "a", "", "b", "b", "c", "d"),
                  c(1, 1, 1, 1, 2, 3, 4),
                  c(11, 11, 11, 11, 12, 12, 14),
                  i = 1:7)
# print(dat)
# input
#    V1 V2 V3 i
# 1:  a  1 11 1
# 2:  a  1 11 2
# 3:     1 11 3
# 4:  b  1 11 4
# 5:  b  2 12 5
# 6:  c  3 12 6
# 7:  d  4 14 7

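# seed subset
foobar <- dat[V1 == "a" | V2 == 1 | V3 == -1]
# loop variable is named iter so it is not confused with column i of dat
for (iter in 1:5) {
  # rows not yet in foobar that share a value with foobar in any column
  baz <- dat[!(i %in% foobar$i) & (V1 %chin% foobar$V1 | V2 %in% foobar$V2 | V3 %in% foobar$V3)]
  if (nrow(baz) == 0) break
  foobar <- rbindlist(list(foobar, baz))
}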
# print(foobar)
# output
#    V1 V2 V3 i
# 1:  a  1 11 1
# 2:  a  1 11 2
# 3:     1 11 3
# 4:  b  1 11 4
# 5:  b  2 12 5
# 6:  c  3 12 6

Upvotes: 1

Views: 36

Answers (1)

CPak

Reputation: 13581

Here's how you can do it with recursion. I tried to use the same variable names you were already using...

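foobar <- dat[V1 == "a" | V2 == 1 | V3 == -1]
counter <- 5
special <- function(dat, foobar, counter) {
  # keep recursing while iterations remain and foobar is not empty
  if (counter > 0 && nrow(foobar) > 0) {
    # rows not yet in foobar that share a value with foobar in any column
    baz <- dat[!(i %in% foobar$i) & (V1 %chin% foobar$V1 | V2 %in% foobar$V2 | V3 %in% foobar$V3)]
    special(dat, rbindlist(list(foobar, baz)), counter - 1)
  } else {
    return(foobar)
  }
}
# call it with the seed subset and the iteration limit
result <- special(dat, foobar, counter)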

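Note that the recursive call is

special(dat, rbindlist(list(foobar,baz)), counter-1)

which decrements counter and passes the enlarged foobar on to the next call. If counter == 0 or foobar is empty, the recursion stops and the function returns the accumulated subset:

return(foobar)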
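If you also want to stop as soon as a pass finds no new rows, and avoid growing foobar with rbindlist on every pass, one possible sketch is to track the selected rows in a logical vector and subset dat only once at the end. The name expand_subset and its arguments keep and max_iter are made up for illustration and are not part of data.table. I also name the columns V1, V2, V3 explicitly here; the question's dat gets the same names automatically. Note that the rows come back in the original order of dat rather than in the order they were discovered.

```r
library(data.table)

# Same sample data as in the question
dat <- data.table(V1 = c("a", "a", "", "b", "b", "c", "d"),
                  V2 = c(1, 1, 1, 1, 2, 3, 4),
                  V3 = c(11, 11, 11, 11, 12, 12, 14),
                  i = 1:7)

# expand_subset is a hypothetical helper, not a data.table function.
# keep is a logical vector marking the rows selected so far.
expand_subset <- function(dat, keep, max_iter = 5L) {
  for (iter in seq_len(max_iter)) {
    sel <- dat[keep]
    # rows not yet kept that share a value with the kept rows in any column
    new_rows <- !keep & (dat$V1 %chin% sel$V1 |
                         dat$V2 %in% sel$V2 |
                         dat$V3 %in% sel$V3)
    if (!any(new_rows)) break   # nothing new found, stop early
    keep <- keep | new_rows
  }
  dat[keep]
}

# The seed condition evaluated in j gives a logical vector, one entry per row
seed <- dat[, V1 == "a" | V2 == 1 | V3 == -1]
result <- expand_subset(dat, seed)
print(result)
```

On the sample data this selects rows 1 to 6, the same rows as the loop in the question. I have not benchmarked it, so whether it is actually faster or lighter on memory for your real data is something you would need to measure.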

Upvotes: 1
