Reputation: 2268
Question: What does Python do under the hood when it sees this kind of expression?
sum(sum(i) for j in arr for i in j)
My thoughts: The above expression works. But as it is written in Python's docs:
generator expressions are implemented using a function scope
To keep it short: I have a nested list with the following layout (as an example):
>>> arr = [
...     [[1, 2, 3], [4, 5, 6]],
...     [[7, 8, 9], [10, 11, 12]]
... ]
At first, I tried to sum all elements of arr with the following expression:
>>> sum(sum(i) for i in j for j in arr)
NameError: name 'j' is not defined
It raises NameError, but why not UnboundLocalError: local variable 'j' referenced before assignment, if generator expressions are implemented using a function scope? What are the evaluation rules for the for ... in ... clauses: left to right or right to left? And what is the equivalent generator function for this generator expression?
EDIT:
I get the idea now. Thanks to @vaultah for some insight. In this case j is evaluated in the enclosing scope and sent to the generator expression as its argument:
>>> sum(sum(i) for i in j for j in arr) # NameError
That is why I get this weird NameError.
@Eric's answer shows that the generator expression:
>>> sum(sum(i) for j in arr for i in j)
is equivalent to:
>>> def __gen(arr):
...     for j in arr:
...         for i in j:
...             yield sum(i)
...
>>> sum(__gen(arr))
Upvotes: 0
Views: 214
Reputation: 103998
Whether it is a generator or a list comprehension, the comprehension nesting is the same. It is easier to see what is going on with a list comprehension and that is what I will use in the examples below.
Given:
>>> arr
[[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]
You can flatten the List of Lists of Ints by 1 level using a nested list comprehension (or generator):
>>> [e for sl in arr for e in sl]
[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
You can flatten completely, given that structure, by nesting again (example only; there are better ways to flatten a deeply nested list):
>>> [e2 for sl2 in [e for sl in arr for e in sl] for e2 in sl2]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
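As one possible alternative to the doubly nested comprehension above, here is a minimal sketch using itertools.chain.from_iterable from the standard library. It assumes the same fixed two-level nesting of arr as in the question; the names one_level, flat and total are only for illustration.

```python
from itertools import chain

arr = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]

# Each chain.from_iterable call removes exactly one level of nesting.
one_level = list(chain.from_iterable(arr))   # list of lists of ints
flat = list(chain.from_iterable(one_level))  # flat list of ints

total = sum(flat)
```

This only works when every element at a given level is itself iterable, so it is not a general solution for irregularly nested data.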
Since sum takes an iterable, the second flattening is not necessary in your example:
>>> [sum(e) for sl in arr for e in sl]
[6, 15, 24, 33] # sum of those is 78...
The general form of a comprehension is:
[ expression for a_variable in a_DEFINED_sequence optional_predicate ]
You can get the same NameError you are seeing on your nested comprehension by using an undefined name:
>>> [c for c in not_defined]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'not_defined' is not defined
So the error you see on sum(sum(i) for i in j for j in arr) is because j has not been defined yet. The for clauses of a comprehension are evaluated left to right, with the leftmost clause acting as the outermost loop. The definition of j as a loop variable is to the right of its attempted use.
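The ordering rule can be checked with a small sketch. It assumes that no variable named j already exists in the enclosing scope (otherwise the first expression would silently use that value instead of failing); wrong_order_error and total are names chosen for illustration.

```python
arr = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]

# Wrong order: j is used as an iterable before the clause that binds it.
try:
    sum(sum(i) for i in j for j in arr)
except NameError as exc:
    wrong_order_error = str(exc)
else:
    wrong_order_error = None  # only reached if a stray j exists

# Right order: the clause that binds j comes first (outermost loop).
total = sum(sum(i) for j in arr for i in j)
```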
To unroll the list comprehension into nested loops, the leftmost for clause becomes the outermost loop:
for sl in arr:
    for sl2 in sl:
        for e in sl2:
            # now you have each int in the LoLoInts...
            # you could use yield e for a generator here
            print(e)
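The unrolled loops above can be wrapped in a generator function, as the comment suggests. This is a sketch only: iter_ints is a hypothetical name, and it assumes the same three-level structure as arr in the question.

```python
arr = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]

def iter_ints(nested):
    """Yield every int from a list of lists of lists of ints."""
    for sl in nested:          # leftmost "for" clause = outermost loop
        for sl2 in sl:
            for e in sl2:      # rightmost "for" clause = innermost loop
                yield e

total = sum(iter_ints(arr))

# The same traversal written as a single generator expression.
same_total = sum(e for sl in arr for sl2 in sl for e in sl2)
```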
Your final question: why do you get a TypeError with gen = (j for j in arr)?
That generator expression does nothing useful: it yields the elements of arr unchanged. Example:
>>> [j for j in arr]
[[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]
>>> [j for j in arr] == arr
True
So the expression (j for j in arr) just returns a generator over arr.
And sum does not know how to add the items of that generator, or of arr itself:
>>> sum(arr)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'list'
Since gen in your example yields the same nested lists, that is the cause of your error.
To fix it:
>>> gen = (e for sl in arr for e in sl)
>>> sum(sum(li) for li in gen)
78
Upvotes: 2
Reputation: 97631
What does Python do under the hood when it sees this kind of expression?
sum(sum(i) for j in arr for i in j)
It becomes something equivalent to:
def __gen(it):
    # "it" is actually in locals() as ".0"
    for j in it:
        for i in j:
            yield sum(i)

sum(__gen(iter(arr)))
Note that both the call to __iter__ and the name resolution of arr happen outside the function scope. This only applies to the iterable of the first for loop. PEP 289 explains this in more detail.
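A small sketch of that difference, assuming the names undefined_outer and undefined_inner are not defined anywhere (they are placeholders chosen for this example): the iterable of the first for clause is evaluated as soon as the generator expression is created, while the iterables of later clauses are only evaluated once iteration starts.

```python
arr = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]

# First "for" clause: its iterable is resolved immediately, in the enclosing scope.
try:
    gen = (x for x in undefined_outer)
    outer_failed_at_creation = False
except NameError:
    outer_failed_at_creation = True

# Later "for" clause: its iterable is resolved inside the generator, lazily.
gen = (i for j in arr for i in undefined_inner)  # no error raised here
try:
    next(gen)
    inner_failed_on_iteration = False
except NameError:
    inner_failed_on_iteration = True
```

This also suggests why the question saw NameError rather than UnboundLocalError: the failing lookup of j happened in the enclosing scope, where j was never a local variable.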
Upvotes: 2