Daniele Bacarella

Reputation: 464

Strange behavior with nested list comprehension

I have the following piece of code:

[e for e in [sl] for sl in [1,[2,3],4,5]]

which I thought was equivalent (in terms of output) to:

[sl for sl in [1,[2,3],4,5]]

Yet, while the latter produces: [1,[2,3],4,5] the former returns: [5, 5, 5, 5]


I think it must have something to do with how nested for-statements are evaluated.

I found a similar case here, Weird behavior: Lambda inside list comprehension, but since it uses an anonymous function, the reason behind that behavior should be different.

Clearly, there's something I'm missing and I don't see.

Thank you for any clarification

UPDATE

As Patrick pointed out, the order of the two for clauses is wrong, and the code shouldn't run unless sl was defined beforehand. I fooled myself here because I ran the examples in the interpreter, and [sl for sl in [1,[2,3],4,5]] was executed first, leaving sl set to the last value of the list in globals().


Now it would be great to understand how this is evaluated

[e for e in [sl] for sl in [1,[2,3],4,5]]

in order to produce [5, 5, 5, 5] in output.

Upvotes: 0

Views: 226

Answers (2)

Patrick Haugh

Reputation: 60994

Is sl defined elsewhere in your code? Perhaps as 5? As written, your first example should not run, and does not run for me in Python 3.6. The correct way to write it would be

[e for sl in [1,[2,3],4,5] for e in [sl]]

Note that here sl is defined before it is used.
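A minimal sketch of the difference between the two orderings, assuming a fresh namespace in which sl has never been defined. The helper name run_fresh is hypothetical and only there for illustration; it evaluates each comprehension with an empty globals dict so that no leftover sl can interfere.

```python
WRONG = "[e for e in [sl] for sl in [1, [2, 3], 4, 5]]"
RIGHT = "[e for sl in [1, [2, 3], 4, 5] for e in [sl]]"


def run_fresh(source):
    """Evaluate `source` in an empty namespace.

    Returns the resulting list, or the name of the exception
    if a NameError is raised.
    """
    try:
        return eval(source, {})
    except NameError as exc:
        return type(exc).__name__


wrong_result = run_fresh(WRONG)  # sl is used before any loop binds it
right_result = run_fresh(RIGHT)  # sl is bound by the first for clause

print(wrong_result)  # NameError
print(right_result)  # [1, [2, 3], 4, 5]
```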

Edit:

Python reads list comprehensions left to right. When it gets to for e in [sl], it evaluates the expression [sl] based on what it already knows, without reading the rest of the line. Your list comprehension is then something like

[e for e in [5] for sl in [1,[2,3],4,5]]

As the second for clause iterates over the four elements of [1,[2,3],4,5], you get 5 four times in the resultant list.
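The evaluation described above can be sketched as ordinary nested loops. This is an illustrative rewrite, not what the interpreter literally generates; the function name desugared is hypothetical, and its parameter stands in for the leftover value of sl.

```python
def desugared(sl):
    """Loop equivalent of [e for e in [sl] for sl in [1, [2, 3], 4, 5]]."""
    result = []
    # [sl] is built once, from the value sl has at this point.
    for e in [sl]:
        # Rebinding sl here cannot change e, which is already bound.
        for sl in [1, [2, 3], 4, 5]:
            result.append(e)
    return result


print(desugared(5))  # [5, 5, 5, 5]
```

Passing a different starting value, for example desugared("x"), yields that value four times, which shows that only the value of sl before the comprehension matters.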

When writing list comprehensions, it's natural to write them from smallest to biggest

e for e in x for x in y for y in z #wrong

but you should actually write them the other way around, from biggest to smallest, so that the interpreter already knows each identifier by the time a later for clause uses it

e for y in z for x in y for e in x

This is no different from regular for loops:

for e in x:
    for x in y:
        for y in z:
            print(e)

is pretty obviously wrong, and list comprehensions are no different.
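For comparison, here is a small sketch of the correct ordering in both forms. The sample data in z is made up for illustration; the point is only that the for clauses of the comprehension appear in the same order as the nested for statements.

```python
# Hypothetical sample data: a list of lists of lists.
z = [[[1, 2], [3]], [[4, 5, 6]]]

# Comprehension: outermost loop first, innermost loop last.
flat_comp = [e for y in z for x in y for e in x]

# The same loops written as regular for statements.
flat_loop = []
for y in z:
    for x in y:
        for e in x:
            flat_loop.append(e)

print(flat_comp)  # [1, 2, 3, 4, 5, 6]
print(flat_loop)  # [1, 2, 3, 4, 5, 6]
```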

Upvotes: 2

Joe Iddon

Reputation: 20424

The only way that the code can run is if sl is defined elsewhere. If it is (as 5), then the code:

sl = 5

[e for e in [sl] for sl in [1,[2,3],4,5]]

produces the output of:

[5,5,5,5]

Why?

The reason this is happening is that the for-loops evaluate from left to right. So the first thing that happens is that e is assigned 5, just as you could write:

[i for i in [9]]

which would give [9].

So now we know that, regardless of the rightmost for-loop, the value of e will always be the value sl had when [sl] was evaluated, in our case 5. Now, why is the output [5,5,5,5]? Well, it's confusing because the variable sl is being reused. However, this does not affect the left of the list comprehension, as it evaluates left to right. So e will always have the value from [sl] (5), no matter what sl is on the right-hand side. The right-hand loop simply acts as a counter. Since there are 4 elements in it (1, [2,3], 4, 5), the output expression e is evaluated 4 times. But e is always 5, so each time it is evaluated it yields 5, producing the result [5,5,5,5].

To demonstrate that the right hand side is simply a counter, the following will all produce the same result of [5,5,5,5]:

[e for e in [sl] for _  in [1, [2,3], 4, 5]]
[e for e in [sl] for sl in [0, 0, 0, 0]]
[e for e in [sl] for _  in range(4)]
[e for e in [sl] for sl in range(4)]
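A quick way to check the four variants side by side is sketched below, assuming sl starts out as 5 in each case. Each expression is evaluated with its own globals dict so that none of them can affect the next; the names variants, results and shorter are hypothetical.

```python
variants = [
    "[e for e in [sl] for _  in [1, [2, 3], 4, 5]]",
    "[e for e in [sl] for sl in [0, 0, 0, 0]]",
    "[e for e in [sl] for _  in range(4)]",
    "[e for e in [sl] for sl in range(4)]",
]

# Evaluate every variant in a fresh namespace where sl is 5.
results = [eval(source, {"sl": 5}) for source in variants]

for source, result in zip(variants, results):
    print(source, "->", result)  # each line ends with [5, 5, 5, 5]

# Changing only the length of the right-hand iterable changes the count.
shorter = eval("[e for e in [sl] for _ in range(2)]", {"sl": 5})
print(shorter)  # [5, 5]
```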

Upvotes: 0
