lnshi

Reputation: 2868

Mysterious behaviour of Python's built-in filter function in a for loop

Consider the following code and its output:

a = list(range(10))

res = list(a)
for i in a:
  if i in {3, 5}:
    print('>>>', i)
    res = filter(lambda x: x != i, res)

print(list(res))

>>> 3
>>> 5
[0, 1, 2, 3, 4, 5, 6, 7, 8]

So neither 3 nor 5 was removed, but 9 is gone...

If I convert the filter object to a list on each iteration, it works as expected:

a = list(range(10))

res = list(a)
for i in a:
  if i in {3, 5}:
    print('>>>', i)
    # Converting the filter object to a list here makes it work as expected.
    res = list(filter(lambda x: x != i, res))

print(list(res))

>>> 3
>>> 5
[0, 1, 2, 4, 6, 7, 8, 9]

I suspect this happens because the filter object is lazy, like a generator, but I cannot work out exactly how that laziness causes this consistently odd behaviour. Please explain the underlying mechanism.

Upvotes: 0

Views: 60

Answers (1)

alani

Reputation: 13079

The behaviour arises from a combination of two facts:

  1. The lambda function refers to the variable i from the surrounding scope, and i is looked up only when the lambda is called, not when it is defined. Consider this example:
>>> func = lambda x: x != i  # i does not even need to exist yet
>>> i = 3
>>> func(3)  # now i will be used
False
  2. Because filter returns a lazy iterator rather than a list, the lambda is called only when you actually iterate over the result, not when filter is called.

The combined effect of these, in the first example, is that by the time you iterate over the filter object the loop has finished and i has the value 9. Both stacked lambdas use that value, so 9 is the only element removed.
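The late lookup can be made visible with a small sketch of the question's first snippet. The variable result is a name introduced here for illustration only.

```python
a = list(range(10))

res = list(a)
for i in a:
    if i in {3, 5}:
        # No filtering happens here: filter only wraps the previous iterable.
        res = filter(lambda x: x != i, res)

# The loop is over, so the name i now refers to 9.
print(i)  # 9

# Only now are the two stacked lambdas called, and both compare against i == 9.
result = list(res)
print(result)  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```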

The desired behaviour can be obtained by removing either (or both) of the two combined factors mentioned above:

  1. In the lambda, force early binding by using the value of i as the default value of a parameter (say j), so in place of lambda x: x != i, you would use:
lambda x, j=i: x != j
  • The expression for the default value (i.e. i) is evaluated when the lambda is defined, and by calling the lambda with only one argument (x) this ensures that you do not override this default at execution time.

or:

  2. Force early execution of the whole iteration by converting the filter object to a list immediately (as you have observed).
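As a rough sketch, the two fixes can be compared side by side on the question's data. The names res_default, res_eager, result_default and result_eager are made up for this illustration.

```python
a = list(range(10))

# Fix 1: early binding. The default value j=i is evaluated when the
# lambda is defined, so each lambda keeps its own copy (3, then 5).
res_default = list(a)
for i in a:
    if i in {3, 5}:
        res_default = filter(lambda x, j=i: x != j, res_default)
result_default = list(res_default)

# Fix 2: eager evaluation. list() runs the filter immediately,
# while i still holds the value seen in this iteration.
res_eager = list(a)
for i in a:
    if i in {3, 5}:
        res_eager = list(filter(lambda x: x != i, res_eager))
result_eager = res_eager

print(result_default)  # [0, 1, 2, 4, 6, 7, 8, 9]
print(result_eager)    # [0, 1, 2, 4, 6, 7, 8, 9]
```

The first variant stays lazy and only changes what the lambda compares against; the second gives up laziness but leaves the lambda untouched.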

Upvotes: 1
