Reputation: 455
Consider the following list where each member contains sets of numbers.
sets <- list(a=1:3, b=2:3, c=4:6, d=4:6, e=7)
I want to identify all sets that are proper subsets of another set in the list, such that my desired result would look like this...
c(F,T,F,F,F)
Because my actual sets are quite large, I don't want to need to calculate power sets of each set. Does anyone have a thought for an efficient way to do this?
This is what I've done so far, and it works, but this can't be the most elegant way of doing it.
truthtable <- bind_rows(lapply(X=sets, FUN=function(x, allsets){
unlist(lapply(X=allsets, FUN=function(x,testset){
return(all(x %in% testset) & !setequal(x, testset))
}, testset=x))
}, allsets=sets))
apply(truthtable, 1, function(x){(all(!x))})
Upvotes: 0
Views: 109
Reputation: 15813
In order to make operations on sets fast, use binary decision diagrams.
In function of what operations on sets of sets you need, you can choose different variations of BSDs. In the most general case, use the ID of each set on the terminal nodes and do not unify the terminal nodes.
There are thousands of different articles where you can learn how to implement them; comparing to lists and other trivial data structures, to implement BSDs you have lots of different ways to do it, and some more mental effort to do to be able to use them, but after you understand them you will love this data structure.
This is a big intellectual effort to do to understand it, but it will run really fast when you implement list of sets, sets of sets (powerset), as you wish.
Upvotes: 0
Reputation: 146224
I don't know where allsets
comes from, but your overall approach looks okay. Here's a refactored version using a simple for
loop:
is_proper_subset = function(x, y) {
all(x %in% y) && !setequal(x, y)
}
result = rep(NA, length(sets))
for (i in seq_along(sets)) {
result[i] = any(sapply(sets[-i], is_proper_subset, x = sets[[i]]))
}
result
# [1] FALSE TRUE FALSE FALSE FALSE
Upvotes: 1