Reputation: 433
I have a list that looks as follows:
a <- c(1, 3, 4)
b <- c(0, 2, 6)
c <- c(3, 4)
d <- c(0, 2, 6)
list(a, b, c, d)
From this list I would like to remove all subsets such that the list looks as follows:
[[1]]
[1] 1 3 4
[[2]]
[1] 0 2 6
How do I do this? In my actual data I am working with a very long list (> 500k elements) so any suggestions for an efficient implementation are welcome.
Upvotes: 0
Views: 166
Reputation: 81703
Here is an approach.
lst <- list(a, b, c, d) # The list
First, remove all duplicates.
lstu <- unique(lst)
If the list still contains more than one element, we order the list by the lengths of its elements (decreasing).
lstuo <- lstu[order(-lengths(lstu))]
Then subsets can be filtered with this command:
lstuo[c(TRUE, !sapply(2:length(lstuo),
function(x) any(sapply(seq_along(lstuo)[-x],
function(y) all(lstuo[[x]] %in% lstu[[y]])))))]
The result:
[[1]]
[1] 1 3 4
[[2]]
[1] 0 2 6
Upvotes: 2
Reputation: 13581
Your data
lst <- list(a, b, c, d) # The list
lstu <- unique(lst) # remove duplicates, piggyback Sven's approach
Make matrix of values and index
m <- combn(lstu, 2) # 2-row matrix of non-self pairwise combinations of values
n <- combn(length(lstu), 2) # 2-row matrix of non-self pairwise combination of index
Determine if subset
issubset <- t(sapply(list(c(1,2),c(2,1)), function(z) mapply(function(x,y) all(x %in% y), m[z[1],], m[z[2],])))
Discard subset vectors from list
discard <- c(n*issubset)[c(n*issubset)>0]
ans <- lstu[-discard]
Output
[[1]]
[1] 1 3 4
[[2]]
[1] 0 2 6
Upvotes: 1