Reputation: 114518
The max and min functions evaluate the key argument exactly once per element, which I infer from the documentation of list.sort that they refer to (as well as an educated guess about their implementation):
The key corresponding to each item in the list is calculated once and then used for the entire sorting process.
This means that it should be safe to use a key function that does not always return the same output for a given input. But is it possible to retrieve the key of the max or min elegantly without a custom function or calling the key function again?
For a non-deterministic key, the following would not work:
max_val = max(iterable, key=key)
max_key = key(max_val)
The same problem occurs with
max_val = sorted(iterable, key=key, reverse=True)[0]
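To make the problem concrete, here is a small sketch with a key that never returns the same value twice. The counter-based key and the names make_counting_key, best and recomputed are made up for illustration, and the sketch assumes that max calls the key once per element in iteration order, as CPython does.

```python
from itertools import count

def make_counting_key():
    """Return a key function that yields 0, 1, 2, ... on successive calls."""
    calls = count()
    def key(_item):
        return next(calls)
    return key

key = make_counting_key()
items = ["a", "b", "c"]

best = max(items, key=key)   # max sees the keys 0, 1, 2, so "c" wins with key 2
recomputed = key(best)       # a fourth call: no longer the key that max used
```

The key that max actually used for the winner is discarded, and calling the key again yields something else.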
A custom function could be written like this:
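from itertools import tee

def max_and_key(iterable, *, key=None):
    if key is None:
        key = lambda x: x  # identity, mirroring the default of max
    i1, i2 = tee(iterable)
    # The tuple must be parenthesized inside the generator expression.
    max_val = max((k, -i, v) for i, (k, v) in enumerate(zip(map(key, i1), i2)))
    return max_val[2], max_val[0]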
The tee is necessary to make this work on arbitrary iterables, where the two inputs to zip have to consume the same elements of the iterable without interfering with each other. The zip ensures that the tee does not have to store more than one element at a time, for maximum laziness in evaluation. Enumeration ensures that when the keys are equal but the values differ, ties are broken in a manner consistent with the original functions:
If multiple items are maximal [minimal], the function returns the first one encountered.
Note the minus sign in the expression being maximized.
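A quick sketch of why the sign of the index matters. The list and the names first_wins and last_wins are illustrative only; len stands in for an arbitrary key.

```python
items = ["bb", "aa", "c"]  # "bb" and "aa" tie on len

# Negated index: among equal keys, the smallest index is the largest tuple.
_, _, first_wins = max((len(v), -i, v) for i, v in enumerate(items))

# Plain index: among equal keys, the largest index wins instead.
_, _, last_wins = max((len(v), i, v) for i, v in enumerate(items))
```

Only the negated form agrees with max(items, key=len), which returns the first maximal item. For min the index would be left un-negated.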
All in all, this function seems like massive overkill to retrieve something that is being computed already. Is there a better solution for this?
If there is no other way, at least this function has the same algorithmic complexity and general contract as max.
Tangent/bonus question: what is the adjective meaning "not returning the same result for the same inputs every time"? Non-deterministic is only a small subset of possibilities, and non-reentrant means something subtly different to my understanding.
Upvotes: 3
Views: 183
Reputation: 42139
I believe this should also work:
max(((key(x), x) for x in iterable), key=lambda kx: kx[0])
Upvotes: 0
Reputation: 15903
For this you'll need to precompute the keys. It probably makes most sense to put each key/value pair in a tuple. However, you'll want to take care that min/max/sort only compare the keys and never the values (otherwise, if the values aren't comparable, this will fail whenever there are duplicate keys):
from operator import itemgetter

def max_with_key(iterable, key):
    """
    Returns a (max_key, max_value) tuple by applying max to the iterable with
    the given key. Useful in cases when the key function is non-deterministic
    and the original key used in the max operation is desired.

    >>> from random import randint
    >>> max_with_key([1, 2, 3], key=lambda _: randint(0, 10))
    (9, 3)
    >>> max_with_key([1, 2, 3], key=lambda _: randint(0, 10))
    (8, 1)
    """
    prekeyed = ((key(x), x) for x in iterable)
    return max(prekeyed, key=itemgetter(0))
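As a rough illustration of the duplicate-key caveat, the sketch below uses dicts as values, since dicts do not support ordering comparisons. The records, the score key and the flag plain_tuples_ok are hypothetical names chosen for the example.

```python
from operator import itemgetter

# Hypothetical data: the first two records tie on the key but are not equal.
records = [
    {"name": "a", "score": 2},
    {"name": "b", "score": 2},
    {"name": "c", "score": 1},
]
score = itemgetter("score")

# Comparing whole tuples falls through to the dicts when the keys tie.
try:
    max((score(r), r) for r in records)
    plain_tuples_ok = True
except TypeError:
    plain_tuples_ok = False

# Comparing only element 0 never touches the values.
max_key, max_val = max(((score(r), r) for r in records), key=itemgetter(0))
```

Because max keeps the first maximal item, the tie here resolves to the record named "a".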
Upvotes: 4
Reputation: 363384
What about using the lexicographical ordering of tuples:
max_key, max_val = max((key(val), val) for val in iterable)
If the values are not comparable, a suggestion from the comments is to add an index so that ties never reach the values:
max_key, _, max_val = max((key(val), -i, val) for i, val in enumerate(iterable))
If the result of the keyfunc is hashable:
d = {key(x): x for x in iterable} # note: last value wins for ties
max_key = max(d)
max_val = d[max_key]
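If the "last value wins" behaviour is unwanted, one possible variant (an assumption on my part, not something from the answer) is to fill the dict with setdefault so that the first value seen for each key is kept. The sample list and the names first and last are illustrative.

```python
items = ["bb", "aa", "c"]  # "bb" and "aa" share the key 2
key = len

# Dict comprehension: later items overwrite earlier ones with the same key.
last = {key(x): x for x in items}

# setdefault only stores a value if the key is not present yet.
first = {}
for x in items:
    first.setdefault(key(x), x)

max_key = max(first)
max_val = first[max_key]
```

Either way the key function is still called exactly once per element.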
Upvotes: 2