Reputation: 1889

max([x for x in something]) vs max(x for x in something): why is there a difference and what is it?

I was working on a project for class where my code wasn't producing the same results as the reference code.

I compared my code with the reference code line by line, and they appeared almost exactly the same. Everything seemed to be logically equivalent. Eventually I began replacing lines and testing until I found the line that mattered.

It turned out to be something like this (EDIT: the exact code is lower down):

# my version:
max_q = max([x for x in self.getQValues(state)])

# reference version which worked:
max_q = max(x for x in self.getQValues(state))

Now, this baffled me. I tried some experiments with the Python (2.7) interpreter, running tests using max on list comprehensions with and without the square brackets. Results seemed to be exactly the same.

Even by debugging via PyCharm I could find no reason why my version didn't produce the exact same result as the reference version. Up to this point I thought I had a pretty good handle on how list comprehensions worked (and how the max() function worked), but now I'm not so sure, because this is such a weird discrepancy.

What's going on here? Why does my code produce different results than the reference code (in 2.7)? How does passing in a comprehension without brackets differ from passing in a comprehension with brackets?

EDIT 2: the exact code was this:

# works
max_q = max(self.getQValue(nextState, action) for action in legal_actions)

# doesn't work (i.e., provides different results)
max_q = max([self.getQValue(nextState, action) for action in legal_actions])

I don't think this should be marked as a duplicate. Yes, the other question covers the difference between comprehension objects and list objects, but not why max() would give different results when passed a list built by a comprehension rather than the bare comprehension.

Upvotes: 15

Answers (3)

Eric

Reputation: 97631

Are you leaking a local variable which is affecting later code?

# works
action = 'something important'
max_q = max(self.getQValue(nextState, action) for action in legal_actions)
assert action == 'something important'

# doesn't work (i.e., provides different results)
max_q = max([self.getQValue(nextState, action) for action in legal_actions])
assert action == 'something important'  # fails!

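Generator expressions and dictionary comprehensions create a new scope, but before Python 3, list comprehensions do not, for backwards compatibility. In Python 2 the loop variable of a list comprehension (action here) therefore leaks into the enclosing scope and overwrites any existing variable with that name.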

Easy way to test - change your code to:

max_q = max([self.getQValue(nextState, action) for action in legal_actions])
max_q = max(self.getQValue(nextState, action) for action in legal_actions)

Assuming self.getQValue is pure, then the only lasting side effect of the first line will be to mess with local variables. If this breaks it, then that's the cause of your problem.
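The leak described above can be reproduced on any Python version by writing the list comprehension as the plain for loop it is equivalent to in Python 2. This is only a sketch: the function names and the use of len() in place of self.getQValue are made up for illustration, and the action strings are hypothetical.

```python
def max_with_generator(legal_actions):
    action = "something important"
    # The generator expression has its own scope, so the outer
    # `action` is left alone.
    max_len = max(len(action) for action in legal_actions)
    return action, max_len


def max_with_for_loop(legal_actions):
    action = "something important"
    lengths = []
    # A Python 2 list comprehension rebinds its loop variable in the
    # enclosing scope, exactly as this plain for loop does.
    for action in legal_actions:
        lengths.append(len(action))
    max_len = max(lengths)
    return action, max_len


actions = ["north", "south", "stop"]  # hypothetical action names
print(max_with_generator(actions))  # ('something important', 5)
print(max_with_for_loop(actions))   # ('stop', 5)
```

Both functions compute the same maximum; the only difference is what action refers to afterwards, which matters if later code reads that variable.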

Upvotes: 17

Mikhail Gerasimov

Reputation: 39576

I don't know why you got different values in your project, but I can give you a live example of when this happens. A generator is more memory-efficient than a list, so the two versions use different amounts of memory. I'm using Python 3.

Here's a function that returns the current memory usage of the Python process:

import os
import psutil


def memory_usage():
    """Get process virtual memory (vms) usage in MB."""
    process = psutil.Process(os.getpid())
    # memory_info() returns a named tuple; vms is the field at index 1.
    memory = process.memory_info().vms / (1024.0 * 1024.0)
    return memory

Try this code:

# Generator version:
max_q = max(memory_usage() for i in range(100000))
print(max_q)  # 7.03125

I tested the code several times and I'm getting something over 7 on my machine.

Replace the generator version with the list version:

# List version:
max_q = max([memory_usage() for i in range(100000)])
print(max_q)  # 11.44921875

I'm getting something over 11 on my machine.

As you can see, the code is almost the same, but you get different output.

Maybe in your project getQValue() returns different values depending on values that were already calculated. Those existing values can be freed by the garbage collector sooner if you use a generator.
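If psutil is not available, the same effect can be seen with the standard library's tracemalloc module. This is a sketch rather than a replacement for the measurement above: it tracks peak bytes allocated by Python during the call, not the process's virtual memory, and the helper name peak_bytes is made up for illustration.

```python
import tracemalloc


def peak_bytes(func):
    """Run func() and return the peak number of bytes traced while it ran."""
    tracemalloc.start()
    try:
        func()
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


# The list version keeps all 100000 floats alive until max() returns.
list_peak = peak_bytes(lambda: max([float(i) for i in range(100000)]))

# The generator version only ever holds a few floats at a time.
gen_peak = peak_bytes(lambda: max(float(i) for i in range(100000)))

print(list_peak > gen_peak)  # True
```

The exact byte counts depend on the interpreter and platform, so only the comparison is shown.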

Upvotes: 2

Jared Mackey

Reputation: 4158

Putting [] around the comprehension builds a complete list, which is then stored in your variable or, in this case, passed to the max function. Without the brackets you create a generator object, which max consumes one item at a time.

results1 = (x for x in range(10))
results2 = [x for x in range(10)]
result3 = max(x for x in range(10))
result4 = max([x for x in range(10)])
print(type(results1)) # <class 'generator'>
print(type(results2)) # <class 'list'>
print(result3) # 9
print(result4) # 9

As far as I know, they should work essentially the same within the max function.
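The one behavioural difference that max can observe is timing: a list evaluates every element up front, while a generator evaluates elements only as max asks for them, and can be consumed only once. A minimal sketch, where square and the calls log are made up for illustration:

```python
calls = []


def square(x):
    calls.append(x)  # record every evaluation
    return x * x


lazy = (square(x) for x in range(3))
calls_after_generator = list(calls)   # nothing has been evaluated yet

eager = [square(x) for x in range(3)]
calls_after_list = list(calls)        # the list evaluated all three items

max_lazy = max(lazy)                  # the generator is evaluated here
max_eager = max(eager)
leftover = list(lazy)                 # a generator is exhausted after one pass

print(calls_after_generator)  # []
print(calls_after_list)       # [0, 1, 2]
print(max_lazy, max_eager)    # 4 4
print(leftover)               # []
```

When the function being called has no side effects, this difference in timing does not change the result of max.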

Upvotes: 7
