Apurva Kunkulol

Reputation: 451

Generator expressions Python

I have a list of dictionaries like the following:

lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]

I wrote a generator expression like:

next((itm for itm in lst if itm['a']==5))

Now the strange part is that although this works for the key-value pair of 'a', the same kind of expression throws an error for every other key. Expression:

next((itm for itm in lst if itm['b']==6))

Error:

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in <genexpr>
KeyError: 'b'

Upvotes: 16

Views: 3601

Answers (5)

cs95

Reputation: 403208

Indeed, your structure is a list of dictionaries.

>>> lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]

To get a better idea of what is happening with your first condition, try this:

>>> gen = (itm for itm in lst if itm['a'] == 5)
>>> next(gen)
{'a': 5}
>>> next(gen)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <genexpr>
KeyError: 'a'

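Each time you call next, the generator resumes with the next element and evaluates the condition on it. The second call reaches {'b': 6}, which has no 'a' key, hence the KeyError. Also...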

next((itm for itm in lst if itm['a'] == 5))

This creates a generator that is not assigned to any variable, processes the first element in lst, sees that key 'a' does indeed exist, and returns the item. The generator is then garbage collected. No error is raised because the first item in lst does contain this key.

So, if you changed the key to be something that the first item does not contain, you get the error you saw:

>>> gen = (itm for itm in lst if itm['b'] == 6)
>>> next(gen)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <genexpr>
KeyError: 'b'

The Solution

Well, one solution as already discussed is to use the dict.get function. Here's another alternative using defaultdict:

from collections import defaultdict
from functools import partial

f = partial(defaultdict, lambda: None)

lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]
lst = [f(itm) for itm in lst] # create a list of default dicts

for i in (itm for itm in lst if itm['b'] == 6):
    print(i)

This prints out:

defaultdict(<function <lambda> at 0x10231ebf8>, {'b': 6})

The defaultdict will return None in the event of the key not being present.
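
One thing to keep in mind with this approach, shown in a small sketch (the name d is only for illustration): looking up a missing key with square brackets on a defaultdict also stores that key, so the filter quietly adds a 'b': None entry to every dictionary it rejects.

```python
from collections import defaultdict

# Illustrative only: wrap a single dictionary the same way as above.
d = defaultdict(lambda: None, {'a': 5})

value = d['b']           # missing key, so the default factory runs
had_b_after = 'b' in d   # the lookup itself stored 'b': None in d

print(value, had_b_after, dict(d))  # None True {'a': 5, 'b': None}
```

dict.get does not have this side effect, which may matter if the dictionaries are reused later.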

Upvotes: 1

willeM_ Van Onsem

Reputation: 477794

That's not weird. For every itm in lst, the generator first evaluates the filter clause. If the filter clause is itm['b'] == 6, it tries to fetch the 'b' key from that dictionary. Since the first dictionary has no such key, it raises a KeyError.

For the first filter example, that is not a problem, since the first dictionary has an 'a' key. The next(..) is only interested in the first element emitted by the generator. So it never asks to filter more elements.

You can use .get(..) here to make the lookup more failsafe:

next((itm for itm in lst if itm.get('b',None)==6))

In case the dictionary has no such key, the .get(..) part will return None. And since None is not equal to 6, the filter will thus omit the first dictionary and look further for another match. Note that if you do not specify a default value, None is the default value, so an equivalent statement is:

next((itm for itm in lst if itm.get('b')==6))

We can also omit the parentheses around the generator expression: the extra parentheses are needed only when the call has multiple arguments:

next(itm for itm in lst if itm.get('b')==6)
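
The parentheses become mandatory as soon as next gets a second argument. A minimal sketch, assuming you would rather get None back than a StopIteration when nothing matches (the value 99 is just an example that matches nothing):

```python
lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]

# Two arguments to next(), so the generator expression needs its own parentheses.
found = next((itm for itm in lst if itm.get('b') == 6), None)
missing = next((itm for itm in lst if itm.get('b') == 99), None)

print(found)    # {'b': 6}
print(missing)  # None
```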

Upvotes: 32

Hou Lu

Reputation: 3232

Maybe you can try this:

next(itm for itm in lst for val in itm.values() if val == 6)

This may be a little tricky: the expression uses two for clauses, one over the dictionaries and one over each dictionary's values, so it returns the first dictionary that holds the value 6 under any key.

Upvotes: 0

poke

Reputation: 388403

Take a look at your generator expression separately:

(itm for itm in lst if itm['a']==5)

This will lazily yield all items in the list where itm['a'] == 5, one at a time and only when asked. So far so good.

When you call next() on it, you tell Python to generate the first item from that generator expression. But only the first.

So when you have the condition itm['a'] == 5, the generator will take the first element of the list, {'a': 5} and perform the check on it. The condition is true, so that item is generated by the generator expression and returned by next().

Now, when you change the condition to itm['b'] == 6, the generator will again take the first element of the list, {'a': 5}, and attempt to get the element with the key b. This will fail:

>>> itm = {'a': 5}
>>> itm['b']
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    itm['b']
KeyError: 'b'

It does not even get the chance to look at the second element because it already fails while trying to look at the first element.
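
To see this for yourself, here is a small sketch that records which dictionaries the filter actually inspects. The helper has_b_equal_6 and the inspected list are made up for this illustration:

```python
lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]
inspected = []  # hypothetical log of the dictionaries the filter has seen

def has_b_equal_6(itm):
    inspected.append(itm)   # record the element before the risky lookup
    return itm['b'] == 6    # raises KeyError when 'b' is missing

try:
    next(itm for itm in lst if has_b_equal_6(itm))
except KeyError as exc:
    print('KeyError:', exc)

print(inspected)  # [{'a': 5}]
```

Only the first dictionary ends up in the log, because the exception aborts the generator before it moves on.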

To solve this, you have to avoid using an expression that can raise a KeyError here. You could use dict.get() to attempt to retrieve the value without raising an exception:

>>> lst = [{'a': 5}, {'b': 6}, {'c': 7}, {'d': 8}]
>>> next((itm for itm in lst if itm.get('b') == 6))
{'b': 6}

Upvotes: 15

freakish

Reputation: 56587

Obviously itm['b'] will raise a KeyError if there is no 'b' key in a dictionary. One way would be to do

next((itm for itm in lst if 'b' in itm and itm['b']==6))

If you don't expect None values in any of the dictionaries, then you can simplify it to

next((itm for itm in lst if itm.get('b')==6))

(this works the same because you compare to 6, but it would give the wrong result if you compared to None)

or safely with a placeholder

PLACEHOLDER = object()
next((itm for itm in lst if itm.get('b', PLACEHOLDER)==6))
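
The difference only shows up when the value you search for is None itself. A short sketch with made-up data (the list data is not from the question):

```python
data = [{'a': 5}, {'b': None}]  # example data, not from the question
PLACEHOLDER = object()

# Plain .get(): the missing key in {'a': 5} also yields None, so it matches.
with_get = next(itm for itm in data if itm.get('b') is None)

# Sentinel default: only a real 'b': None entry matches.
with_placeholder = next(itm for itm in data if itm.get('b', PLACEHOLDER) is None)

print(with_get)          # {'a': 5}
print(with_placeholder)  # {'b': None}
```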

Upvotes: 6
