apriori
apriori

Reputation: 53

How can you remove superset lists from a list of lists in Python?

I have a list of lists in Python like the following:

[[1,2,3],[2,3],[2,4,3],[4,5],[5]]

I want to remove all the inner lists that are a superset (a list that contain all the elements of another list, but with additional elements) of another inner list. For the example above, removing the supersets should result in the following:

[[2,3],[5]]

How can I accomplish this?

Upvotes: 5

Views: 1452

Answers (4)

intentionally-left-nil
intentionally-left-nil

Reputation: 8274

I ended up with the same idea as @Olivier Melançon. You can use ascending order to throw away subsets and have this run in O(n^2) * O(subset calculation).

input = [[1,2,3],[2,3],[2,4,3],[4,5],[5]]
sets = [set(x) for x in input]
sets.sort(key=len)

subsets = []
while sets != []:
    cur = sets[0]
    subsets.append(cur)
    sets = [x for x in sets[1:] if not cur <= x]

output = [list(x) for x in subsets]
print(output)

Upvotes: 3

Tin Luu
Tin Luu

Reputation: 1687

Here it is:

super=[[1,2,3],[2,3],[2,4,3],[4,5],[5]]
subset=[s for s in super if not any(set(s).issuperset(set(i)) and len(s)>len(i) for i in super)]

Output:

>>> subset
[[2, 3], [5]]

Upvotes: 2

Olivier Melan&#231;on
Olivier Melan&#231;on

Reputation: 22294

A set can only be a subset of another if it is smaller, thus by iterating over the sets in ascending order of size, we can check each element against the previously found minimal subsets to know if it is a minimal subset.

def get_minimal_subsets(sets):
    sets = sorted(map(set, sets), key=len)
    minimal_subsets = []
    for s in sets:
        if not any(minimal_subset.issubset(s) for minimal_subset in minimal_subsets):
            minimal_subsets.append(s)

    return minimal_subsets

l = [[1,2,3],[2,3],[2,4,3],[4,5],[5]]

print(get_minimal_subsets(l))  # [{5}, {2, 3}]

Upvotes: 3

Ajax1234
Ajax1234

Reputation: 71451

You can use a list comprehension:

d = [[1,2,3],[2,3],[2,4,3],[4,5],[5]]
new_d = [i for i in d if not any(all(c in i for c in b) and len(b) < len(i) for b in d)]

Output:

[[2, 3], [5]]

Upvotes: 2

Related Questions