mkowalik

Reputation: 133

Python: most effective way of filtering a list

I hope I'm not duplicating here :-)

I'm wondering what the most effective way of filtering a Python list is. The task at hand is to find the elements of one list that do not appear in another list.

My first list is a list of objects (unnecessary details omitted):

class A:
    def __init__(self, item1, item2):
        self.item1 = item1
        self.item2 = item2

Later on in my script, I parse an input text file and populate list1 with real data (both the item1 and item2 fields are strings).

There's also a second list, list2, which contains just strings corresponding to item1. What I'm interested in are the elements of list1 whose item1 is not in list2.
(list1 contains roughly 3000 elements; list2 is bigger, circa 60000 elements.)

My first attempt is quite obvious:

import itertools
notMatched = list(itertools.ifilter(lambda x: x.item1 not in list2, list1))

Now, it works as expected and gives me exactly what I want, but I'm still wondering whether it's the best solution I could come up with. Any ideas, anyone?

Thanks

Upvotes: 3

Views: 906

Answers (1)

Daren Thomas

Reputation: 70314

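Make list2 a set. A membership test on a set takes constant time on average, whereas not in list2 on a list scans it element by element, so with about 60000 strings and 3000 lookups the difference should be noticeable.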

You can probably get away with this:

set2 = set(list2)
not_matched = [a for a in list1 if a.item1 not in set2]
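
For anyone who wants to try this out, here is a minimal self-contained sketch of the same idea. The class A mirrors the one in the question; the sample values in list1 and list2 are made up purely for illustration and stand in for the data parsed from the input file.

```python
class A:
    def __init__(self, item1, item2):
        self.item1 = item1
        self.item2 = item2


# Hypothetical sample data; the real lists come from the parsed input file.
list1 = [A("alpha", "1"), A("beta", "2"), A("gamma", "3"), A("delta", "4")]
list2 = ["alpha", "gamma", "omega"]

# Build the set once, then test each item1 against it.
set2 = set(list2)
not_matched = [a for a in list1 if a.item1 not in set2]

print([a.item1 for a in not_matched])  # ['beta', 'delta']
```

The set is built once up front, so the cost of converting list2 is paid a single time rather than on every lookup. The order of list1 is preserved in not_matched.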

Upvotes: 5
