Erick Gallani
Erick Gallani

Reputation: 809

Python lambda with list comprehension

I've a doubt about lambda expression with internal list comprehension operation.

In the following code, the lambda will instantiate a list for each item every time?

def _find_items_not_present_in_store(self, store_today, store_yesterday):
        # finding what items are not in the store anymore
        items_not_in_store_anymore = filter(lambda item1: item1.item_id not in
                                         [item2.item_id for item2 in store_today.store_items],
                                         store_yesterday)
        return items_not_in_store_anymore

Would be better have this list

[item2.item_id for item2 in store.store_items]

instantiated outside of the lambda expression?

I couldn't find any documentation about it.

Upvotes: 0

Views: 735

Answers (3)

jsbueno
jsbueno

Reputation: 110136

You are performing a linear search for each item in your list - and that is definitely sub-optimal. For a store with 1 million items in stock, hat can lead to the order of (1000000)² comparisons, which can be quite a burden even to fast computers. That is just to start

The thing to do is to create a set with the ID's of one of the collections, and use set's "contains" (the same in operator) - which searches in constant time.

def _find_items_not_present_in_store(self, store_today, store_yesterday):
    yesterday_ids = set(item.item_id for item in store_yesterday)
    return [item for item in store_today if item.item_id not in yesterday_ids]

And - in your code - in addition to searching in lists and not in a set, you are actually recreating the whole yesterday's ID list for each item in the today's list - as your list generator expression is inside the lambda function. In the approach above, I pre-calculate the ID set just once -aas that is what makes sense.

Besides that, as you can see, list comprehension and generator expressions in Python have an if clause that supersedes the usage of the filter function - filter only makes sense when one choose to use a functional notation instead of generators/comprehensions - and in most cases will have the overhead of one additional function call.

Upvotes: 1

Tom Karzes
Tom Karzes

Reputation: 24052

The way you've written it, the list is part of the lambda expression, so it will be evaluated each time the lambda is invoked.

Here's the most efficient way to implement your function:

def _find_items_not_present_in_store(self, store_today, store_yesterday):
    s = set(item2.item_id for item2 in store_today.store_items)
    items_not_in_store_anymore = [item1 for item1 in store_yesterday
                                  if item1.item_id not in s]
    return items_not_in_store_anymore

This does 2 main things to improve the efficiency:

  1. It creates a set, once, for fast membership checking
  2. It replaces the lambda/filter combination with a comprehension, which is more efficient.

Upvotes: 1

Bakuriu
Bakuriu

Reputation: 101909

Every call of the lambda function will re-create that list, so moving that construction outside the lambda will improve performance.

Moreover using a list to check for in isn't a good idea, since it takes linear time. Consider using a set instead:

def _find_items_not_present_in_store(self, store_today, store_yesterday):
        today_ids = {item2.item_if for item2 in store_today.store_items}

        items_not_in_store_anymore = filter(
            lambda item1: item1.item_id not in today_ids, 
            store_yesterday
        )
        return items_not_in_store_anymore

In old versions of python you need to do set( ... ) instead of the set-comprehension { ... }.

Upvotes: 1

Related Questions