Reputation: 1117
I would like to subset a data.table dat based on a condition applied to one of its columns, resulting in a subset foobar, and then use foobar as a subset condition on all columns of dat; rinse and repeat until the new subset is empty or a fixed number of iterations has passed.
The following MWE, with input and output embedded as comments, is my current solution. I was wondering whether there is a faster, perhaps more memory-efficient, and more compact data.table-idiomatic way to do this.
library(data.table)
dat <- data.table(c("a", "a", "", "b", "b", "c", "d"),
                  c(1, 1, 1, 1, 2, 3, 4),
                  c(11, 11, 11, 11, 12, 12, 14),
                  i = 1:7)
# print(dat)
# input
#    V1 V2 V3 i
# 1:  a  1 11 1
# 2:  a  1 11 2
# 3:     1 11 3
# 4:  b  1 11 4
# 5:  b  2 12 5
# 6:  c  3 12 6
# 7:  d  4 14 7
foobar <- dat[V1 == "a" | V2 == 1 | V3 == -1]
for (i in 1:5) {
  baz <- dat[!(i %in% foobar$i) & (V1 %chin% foobar$V1 | V2 %in% foobar$V2 | V3 %in% foobar$V3)]
  if (nrow(baz) == 0) break
  foobar <- rbindlist(list(foobar, baz))
}
# print(foobar)
# output
#    V1 V2 V3 i
# 1:  a  1 11 1
# 2:  a  1 11 2
# 3:     1 11 3
# 4:  b  1 11 4
# 5:  b  2 12 5
# 6:  c  3 12 6
Upvotes: 1
Views: 36
Reputation: 13581
Here's how you can do it with recursion. I tried to use the same variable names you were already using...
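foobar <- dat[V1 == "a" | V2 == 1 | V3 == -1]
counter <- 5
special <- function(dat, foobar, counter) {
  # recurse only while iterations remain and the current subset is non-empty
  if (counter > 0 && nrow(foobar) > 0) {
    # rows not yet in foobar that share a V1, V2 or V3 value with it
    baz <- dat[!(i %in% foobar$i) & (V1 %chin% foobar$V1 | V2 %in% foobar$V2 | V3 %in% foobar$V3)]
    # nothing new matched, so further calls cannot change the result
    if (nrow(baz) == 0) return(foobar)
    special(dat, rbindlist(list(foobar, baz)), counter - 1)
  } else {
    return(foobar)
  }
}
# call it with the initial subset and the iteration limit
special(dat, foobar, counter)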
Note that the recursive call is
special(dat, rbindlist(list(foobar,baz)), counter-1)
which decrements the iteration counter and appends baz to foobar. If counter == 0 or foobar is empty, the function returns foobar.
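If memory is the main concern, one alternative to repeated rbindlist calls is to grow a logical row mask and subset dat once at the end. The following is only a sketch under assumptions: expand_subset, seed and max_iter are hypothetical names that do not appear in the question, the first three columns are named explicitly rather than auto-named, and the result comes back in the row order of dat rather than in order of discovery.

```r
library(data.table)

dat <- data.table(V1 = c("a", "a", "", "b", "b", "c", "d"),
                  V2 = c(1, 1, 1, 1, 2, 3, 4),
                  V3 = c(11, 11, 11, 11, 12, 12, 14),
                  i  = 1:7)

# Hypothetical helper (not from the posts above): widen a logical row mask
# until no new rows match or max_iter passes have run.
expand_subset <- function(dat, seed, max_iter = 5L) {
  keep <- seed
  for (iter in seq_len(max_iter)) {
    if (!any(keep)) break                      # empty subset: nothing to expand
    hit <- dat$V1 %chin% dat$V1[keep] |
           dat$V2 %in%   dat$V2[keep] |
           dat$V3 %in%   dat$V3[keep]
    new_keep <- keep | hit
    if (identical(new_keep, keep)) break       # no new rows: converged
    keep <- new_keep
  }
  dat[keep]                                    # single subset at the end
}

# Initial condition as a logical vector, one element per row of dat
seed <- dat[, V1 == "a" | V2 == 1 | V3 == -1]
res  <- expand_subset(dat, seed)
```

On the example data, res contains the rows with i equal to 1 through 6, the same rows as the loop in the question. Whether this is actually faster on real data would need benchmarking.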
Upvotes: 1