West
West

Reputation: 2570

Fastest way to check if a list is present in a list of lists

I have a list

a=[[1,2,3,4,5,6],[7,8,9,10,11,12]]   

What is the fastest way to check if any list in a is present in another list of lists b, where

b=[[5, 9, 25, 31, 33, 36],[7,8,9,10,11,12],[10, 13, 22, 24, 33, 44]]

If any list in a is present in b, I would like to remove it. I'm currently using this code:

for each in a:
    for item in b:
        if set(each).issubset(item)
            a.remove(each)

This works but is quite slow when working with large lists so was wondering if there's a better way. The above code gives me the following result:

print(a)
[[1, 2, 3, 4, 5, 6]]

I am not worried about order, for example if a list in ais [1,2,3,4,5,6] I want it to be removed if there exist a list [1,2,3,4,5,6] or [3,4,1,6,2,5] etc in list b.

Upvotes: 8

Views: 2529

Answers (3)

bipll
bipll

Reputation: 11940

If you don't care about elements order and frequencies, i.e. treat lists as unordered sets, then probably your solution is almost the correct one (removing an element while iterating the same list is probably not the best idea) with two serious suboptimalities.

First, you currently convert each b's element into a set once per each element of a. I wonder if Python compiler can optimize the repeated work out, at least you could try to do it on your own.

Next, you don't need to remove elements incorrectly and quadratically to simply filter them out.

faster_b = [frozenset(x) for x in b]

def not_in_b(list):
    l = frozenset(list)
    for x in faster_b:
        if l <= x: return False
    return True

print(list(filter(not_in_b, a)))

This one is probably faster.

$ python3
Python 3.6.5 (default, May 11 2018, 04:00:52) 
[GCC 8.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a=[[1,2,3,4,5,6],[7,8,9,10,11,12]]
>>> b=[[5, 9, 25, 31, 33, 36],[7,8,9,10,11,12],[10, 13, 22, 24, 33, 44]]
>>> faster_b = [frozenset(x) for x in b]
>>> 
>>> def not_in_b(list):
...     l = frozenset(list)
...     for x in faster_b:
...         if l <= x: return False
...     return True
... 
>>> print(list(filter(not_in_b, a)))
[[1, 2, 3, 4, 5, 6]]
>>> a=[[1, 1, 2, 3]]
>>> b=[[7, 3, 2, 1], [4, 5, 6]]
>>> faster_b = [frozenset(x) for x in b]
>>> print(list(filter(not_in_b, a)))
[]
>>> a=[[1, 1, 2, 3], [42, 5, 6]]
>>> print(list(filter(not_in_b, a)))
[[42, 5, 6]]

Upvotes: 0

jpp
jpp

Reputation: 164623

A functional solution is possible using set.difference:

res = set(map(tuple, a)).difference(set(map(tuple, b)))

[(1, 2, 3, 4, 5, 6)]

Explanation

  1. Since list is not a hashable type, we convert sublists to type tuple, which are immutable and hashable, e.g. set(map(tuple, a)).
  2. Then use set.difference to take the difference between the 2 resulting sets.

Upvotes: 1

Rakesh
Rakesh

Reputation: 82765

Using a list comprehension with set.

Ex:

a=[[1,2,3,4,5,6],[7,8,9,10,11,12]]  
b=[[5, 9, 25, 31, 33, 36],[7,8,9,10,11,12],[10, 13, 22, 24, 33, 44]]

setA = set(map(tuple, a))
setB = set(map(tuple, b))

print([i for i in setA if i not in setB])

Output:

[(1, 2, 3, 4, 5, 6)]

Upvotes: 3

Related Questions