Reputation: 2570
I have a list
a=[[1,2,3,4,5,6],[7,8,9,10,11,12]]
What is the fastest way to check if any list in a
is present in another list of lists b
, where
b=[[5, 9, 25, 31, 33, 36],[7,8,9,10,11,12],[10, 13, 22, 24, 33, 44]]
If any list in a is present in b, I would like to remove it. I'm currently using this code:
for each in a:
for item in b:
if set(each).issubset(item)
a.remove(each)
This works but is quite slow when working with large lists so was wondering if there's a better way. The above code gives me the following result:
print(a)
[[1, 2, 3, 4, 5, 6]]
I am not worried about order, for example if a list in a
is [1,2,3,4,5,6]
I want it to be removed if there exist a list [1,2,3,4,5,6]
or [3,4,1,6,2,5]
etc in list b
.
Upvotes: 8
Views: 2529
Reputation: 11940
If you don't care about elements order and frequencies, i.e. treat lists as unordered sets, then probably your solution is almost the correct one (removing an element while iterating the same list is probably not the best idea) with two serious suboptimalities.
First, you currently convert each b's element into a set once per each element of a. I wonder if Python compiler can optimize the repeated work out, at least you could try to do it on your own.
Next, you don't need to remove elements incorrectly and quadratically to simply filter them out.
faster_b = [frozenset(x) for x in b]
def not_in_b(list):
l = frozenset(list)
for x in faster_b:
if l <= x: return False
return True
print(list(filter(not_in_b, a)))
This one is probably faster.
$ python3
Python 3.6.5 (default, May 11 2018, 04:00:52)
[GCC 8.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a=[[1,2,3,4,5,6],[7,8,9,10,11,12]]
>>> b=[[5, 9, 25, 31, 33, 36],[7,8,9,10,11,12],[10, 13, 22, 24, 33, 44]]
>>> faster_b = [frozenset(x) for x in b]
>>>
>>> def not_in_b(list):
... l = frozenset(list)
... for x in faster_b:
... if l <= x: return False
... return True
...
>>> print(list(filter(not_in_b, a)))
[[1, 2, 3, 4, 5, 6]]
>>> a=[[1, 1, 2, 3]]
>>> b=[[7, 3, 2, 1], [4, 5, 6]]
>>> faster_b = [frozenset(x) for x in b]
>>> print(list(filter(not_in_b, a)))
[]
>>> a=[[1, 1, 2, 3], [42, 5, 6]]
>>> print(list(filter(not_in_b, a)))
[[42, 5, 6]]
Upvotes: 0
Reputation: 164623
A functional solution is possible using set.difference
:
res = set(map(tuple, a)).difference(set(map(tuple, b)))
[(1, 2, 3, 4, 5, 6)]
Explanation
list
is not a hashable type, we convert sublists to type tuple
, which are immutable and hashable, e.g. set(map(tuple, a))
.set.difference
to take the difference between the 2 resulting sets.Upvotes: 1
Reputation: 82765
Using a list comprehension
with set
.
Ex:
a=[[1,2,3,4,5,6],[7,8,9,10,11,12]]
b=[[5, 9, 25, 31, 33, 36],[7,8,9,10,11,12],[10, 13, 22, 24, 33, 44]]
setA = set(map(tuple, a))
setB = set(map(tuple, b))
print([i for i in setA if i not in setB])
Output:
[(1, 2, 3, 4, 5, 6)]
Upvotes: 3