user1389960
user1389960

Reputation: 433

Removing all subsets from a list

I have a list that looks as follows:

a <- c(1, 3, 4)
b <- c(0, 2, 6)
c <- c(3, 4)
d <- c(0, 2, 6)
list(a, b, c, d)

From this list I would like to remove all subsets such that the list looks as follows:

[[1]]
[1] 1 3 4

[[2]]
[1] 0 2 6

How do I do this? In my actual data I am working with a very long list (> 500k elements) so any suggestions for an efficient implementation are welcome.

Upvotes: 0

Views: 166

Answers (2)

Sven Hohenstein
Sven Hohenstein

Reputation: 81703

Here is an approach.

lst <- list(a, b, c, d)  # The list

First, remove all duplicates.

lstu <- unique(lst)

If the list still contains more than one element, we order the list by the lengths of its elements (decreasing).

lstuo <- lstu[order(-lengths(lstu))]

Then subsets can be filtered with this command:

lstuo[c(TRUE, !sapply(2:length(lstuo),
                      function(x) any(sapply(seq_along(lstuo)[-x], 
                                             function(y) all(lstuo[[x]] %in% lstu[[y]])))))]

The result:

[[1]]
[1] 1 3 4

[[2]]
[1] 0 2 6

Upvotes: 2

CPak
CPak

Reputation: 13581

Alternative approach

Your data

lst <- list(a, b, c, d)  # The list
lstu <- unique(lst)        # remove duplicates, piggyback Sven's approach

Make matrix of values and index

m <- combn(lstu, 2)           # 2-row matrix of non-self pairwise combinations of values
n <- combn(length(lstu), 2)   # 2-row matrix of non-self pairwise combination of index

Determine if subset

issubset <- t(sapply(list(c(1,2),c(2,1)), function(z) mapply(function(x,y) all(x %in% y), m[z[1],], m[z[2],])))

Discard subset vectors from list

discard <- c(n*issubset)[c(n*issubset)>0]
ans <- lstu[-discard]

Output

[[1]]
[1] 1 3 4

[[2]]
[1] 0 2 6

Upvotes: 1

Related Questions