Abubakar Moallim
Abubakar Moallim

Reputation: 4923

How to remove list with in a list if an element in one list is present in the other

I have a list containing several lists. The lists may contain integer elements ranging from 0 to 9, if two or more lists have a common element remove the lists of smaller length

[[0, 3, 7],
 [0, 3, 7, 9],
 [0, 2, 3, 4, 7, 8, 9],
 [2, 4, 8, 9],
 [2, 4, 7, 8, 9],
 [5, 6]]

Output should be:

[[5, 6], [0, 2, 3, 4, 7, 8, 9]]

Any ideas?

Upvotes: 0

Views: 82

Answers (3)

Padraic Cunningham
Padraic Cunningham

Reputation: 180441

l = [[0, 3, 7], [0, 3, 7, 9], [0, 2, 3, 4, 7, 8, 9], [2, 4, 8, 9], [2, 4, 7, 8, 9], [5, 6]]

longest = max(l,key=len)
st = set(longest)
print([longest]+[ele for ele in l if not st.intersection(ele)])
[[0, 2, 3, 4, 7, 8, 9], [5, 6]]

To catch overlapping sublists:

longest = max(l, key=len)

seen = set()
seen.update(longest)
out = [longest]
for sub in l:
    if not seen.intersection(sub):
        out.append(sub)
    seen.update(sub)

Upvotes: 1

aneroid
aneroid

Reputation: 15962

Option 1: Nested iteration of each list's elements and compare with each of every other. (You get the idea). Probably not the most efficient way.

Option 2: Convert (copy) each of the lists into sets. Pick your biggest set A and subtract each of the others B. If the length of the result of A - B is less than the length of A then discard the list represented by B. If the result has the same length as A then all of items in B were unique (not in A), so don't discard that B.

Upvotes: 0

Martijn Pieters
Martijn Pieters

Reputation: 1122342

Sort your input lists by length, then take the longest list and add it to the output. Create a set from this longest list against which you test other lists. Any subsequent shorter list that intersects with this set is discarded.

If you find a shorter list that doesn't intersect, add it to the output, and update your base set; shorter lists that now intersect share at least one number with the one or more lists in the output, after all. Continue until all lists have been tested:

def eliminate_shorter(list_of_lists):
    inputlist = sorted(list_of_lists, key=len)
    outputlist = [inputlist.pop()]
    numbers = set(outputlist[0])
    for sublist in reversed(inputlist):
        if not numbers.intersection(sublist):
            numbers.update(sublist)
            outputlist.append(sublist)
    return outputlist

This is an algorithm of O(NlogN) complexity (because of the initial sort).

Demo:

>>> sample = [[0, 3, 7], [0, 3, 7, 9], [0, 2, 3, 4, 7, 8, 9], [2, 4, 8, 9], [2, 4, 7, 8, 9], [5, 6]]
>>> def eliminate_shorter(list_of_lists):
...     inputlist = sorted(list_of_lists, key=len)
...     outputlist = [inputlist.pop()]
...     numbers = set(outputlist[0])
...     for sublist in reversed(inputlist):
...         if not numbers.intersection(sublist):
...             numbers.update(sublist)
...             outputlist.append(sublist)
...     return outputlist
... 
>>> eliminate_shorter(sample)
[[0, 2, 3, 4, 7, 8, 9], [5, 6]]

Upvotes: 4

Related Questions