Reputation: 43
I'm confused about what's wrong with my code:
users = [{'id': 1, 'name': 'Number1', 'age': 11},
{'id': 2, 'name': 'Number2', 'age': 12},
{'id': 3, 'name': 'Number3', 'age': 13},
{'id': 4, 'name': 'Number4', 'age': 14}]
_keys = ('name', 'age')
data_by_user_id = {u.get('id'): (u.get(k) for k in _keys) for u in users}
data_by_user_id looks like:
{1: <generator object <genexpr> at 0x7f3c12c31050>, 2: <generator object <genexpr> at 0x7f3c12c310a0>, 3: <generator object <genexpr> at 0x7f3c12c310f0>, 4: <generator object <genexpr> at 0x7f3c12c31140>}
but after iteration:
for user_id, data in data_by_user_id.iteritems():
    name, age = data
    print user_id, name, age
the result is different from what I expected:
1 Number4 14
2 Number4 14
3 Number4 14
4 Number4 14
Can anyone explain what I am doing wrong here? I know I can use a list comprehension instead of a generator, but I'm trying to figure out what the issue with my code is.
Thanks!
Upvotes: 4
Views: 2253
Reputation: 477794
The expression in your dictionary comprehension:
(u.get(k) for k in _keys)
is a generator expression, so it constructs a generator. A generator is an iterable object that evaluates its elements lazily: it does not fetch the elements from u right away, but postpones this until you, for instance, call next(..) on it to obtain the next element. So your dictionary maps each id to such a generator.
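To see the laziness concretely, here is a minimal sketch (written so that it also runs on Python 3). The lookup helper and the calls list are hypothetical names introduced only to record when each lookup actually happens; they are not part of the question's code.

```python
# Sample data mirroring one entry of the question's list.
u = {'id': 1, 'name': 'Number1', 'age': 11}
_keys = ('name', 'age')

calls = []  # hypothetical log of every key that has been looked up


def lookup(d, k):
    # Hypothetical wrapper around dict.get that records each call.
    calls.append(k)
    return d.get(k)


gen = (lookup(u, k) for k in _keys)
calls_before = list(calls)       # still empty: creating the generator looks up nothing

first = next(gen)                # runs the body once, for 'name' only
calls_after_first = list(calls)

rest = list(gen)                 # consumes whatever is left
```

Creating the generator performs no lookups at all; each call to next(..) performs exactly one.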
In the body of the for loop, you write:
name, age = data
with data being the value of the item. This asks Python to "unpack" the iterable, which works provided the iterable yields exactly as many elements as there are variables on the left, in this case two. As a result, you exhaust the generator and obtain its elements, which you then print.
Note that after the for loop, all the values of the dictionary will be exhausted generators, so your for loop has side effects. To prevent that, you should materialize the generators, for example as lists or tuples.
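As a rough illustration of both points, the sketch below first shows that unpacking leaves a generator empty, and then materializes the values with tuple(..). It uses a shortened copy of the question's data and is written so that it also runs on Python 3; the names leftover, first_pass and second_pass are made up for this example.

```python
users = [{'id': 1, 'name': 'Number1', 'age': 11},
         {'id': 2, 'name': 'Number2', 'age': 12}]
_keys = ('name', 'age')

# Unpacking consumes the generator ...
gen = (users[0].get(k) for k in _keys)
name, age = gen
leftover = list(gen)   # ... so nothing is left for a second read

# tuple(..) evaluates the lookups immediately, during each iteration
# of the comprehension, so the stored values can be read repeatedly.
data_by_user_id = {u.get('id'): tuple(u.get(k) for k in _keys) for u in users}

first_pass = [data_by_user_id[i] for i in sorted(data_by_user_id)]
second_pass = [data_by_user_id[i] for i in sorted(data_by_user_id)]
```

Because each tuple is built while u still refers to the user being processed, every entry holds that user's own values.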
EDIT: another problem here is that the generators refer to the variable u of the dictionary comprehension, not to the value it had when they were created. As a result, if u is rebound, the result of the generators changes as well. This is problematic because by the end of the dictionary comprehension, all generators work with the last dictionary.
You can solve the problem by generating a local scope:
{u.get('id'): (lambda u=u: (u.get(k) for k in _keys))() for u in users}
Now it generates the expected output:
>>> users = [{'id': 1, 'name': 'Number1', 'age': 11},
... {'id': 2, 'name': 'Number2', 'age': 12},
... {'id': 3, 'name': 'Number3', 'age': 13},
... {'id': 4, 'name': 'Number4', 'age': 14}]
>>>
>>> _keys = ('name', 'age')
>>> data_by_user_id = {u.get('id'): (lambda u=u: (u.get(k) for k in _keys))() for u in users}
>>> for user_id, data in data_by_user_id.iteritems():
...     name, age = data
...     print user_id, name, age
...
1 Number1 11
2 Number2 12
3 Number3 13
4 Number4 14
Upvotes: 3
Reputation: 78554
As you probably already know, generator expressions are evaluated lazily. The evaluation of dict.get is deferred until the generator expression is consumed, at which time u in the current scope is the last dictionary in your list:
>>> u = {'id': 1, 'name': 'Number1', 'age': 11}
>>> _keys = ('name', 'age')
>>> gen = (u.get(k) for k in _keys)
>>> # update u
>>> u = {'id': 4, 'name': 'Number4', 'age': 14}
>>> list(gen)
['Number4', 14]
One obvious way to fix this is to use a list comprehension instead. Another way, not as good as the first, is to put the generator expression in a function and bind the current value of u to that function by means of a default argument:
data_by_user_id = {u.get('id'): lambda x=u: (x.get(k) for k in _keys) for u in users}
for user_id, data in data_by_user_id.iteritems():
    name, age = data()
    print name, age
Number1 11
Number2 12
Number3 13
Number4 14
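If the values should stay lazy, the same binding idea can also be written with a named generator function instead of a lambda. This is only a sketch of that variant, written so that it also runs on Python 3; fields is a hypothetical helper name, not something from the question.

```python
users = [{'id': 1, 'name': 'Number1', 'age': 11},
         {'id': 2, 'name': 'Number2', 'age': 12},
         {'id': 3, 'name': 'Number3', 'age': 13},
         {'id': 4, 'name': 'Number4', 'age': 14}]
_keys = ('name', 'age')


def fields(user, keys):
    # `user` is a parameter, so each generator keeps a reference to the
    # dictionary it was called with, regardless of what `u` is bound to later.
    for k in keys:
        yield user.get(k)


data_by_user_id = {u.get('id'): fields(u, _keys) for u in users}

# Consume each generator once and collect the values for inspection.
result = {user_id: tuple(data) for user_id, data in data_by_user_id.items()}
```

Each generator is still single-use here; the function argument only fixes which dictionary it reads from.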
Upvotes: 2