Reputation: 2658
I read about the wrapper function to move a cmp style comparison into a key style comparison in Python 3, where the cmp capability was removed.
I'm having a heck of a time wrapping my head around how a Python3 straight key style sorted() function, with, at least as I understand it, just one item specified for the key, can allow you to properly compare, for instance, two IPs for ordering. Or ham calls.
Whereas with cmp there was nothing to it: sorted() and sort() called you with the two ips, you looked at the appropriate portions, made your decisions, done.
def ipCompare(dqA,dqB):
...
ipList = sorted(ipList,cmp=ipCompare)
Same thing with ham radio calls. The sorting isn't alphabetic; the calls are generally letter(s)+number(s)+letter(s); the first sorting priority is the number portion, then the first letter(s), then the last letter(s.)
Using cmp... no sweat.
def callCompare(callA,callB):
...
hamlist = sorted(hamlist,cmp=callCompare)
With Python3... without going through the hoop jumping of the wrapper... and being passed one item... I think... how can that be done?
And if the wrapper is absolutely required... then why remove cmp within Python3 in the first place?
I'm sure I'm missing something. I just can't see it. :/
ok, now I know what I was missing. Solutions for IPs
were given in the answers below. Here's a key I came up with for sorting ham calls of the common prefix, region, postfix form:
import re
def callKey(txtCall):
l = re.split('([0-9]*)',txtCall.upper(),1)
return l[1],l[0],l[2]
hamList = ['N4EJI','W1AW','AA7AS','na1a']
sortedHamList = sorted(hamList,key=callKey)
sortedHamList
result is ['na1a','W1AW','N4EJI','AA7AS']
Detail:
AA7AS
comes out of callKey()
as 7,AA,AS
N4EJI
comes out of callKey()
as 4,N,EJI
W1AW
comes out of callKey()
as 1,W,AW
na1a
comes out of callKey()
as 1,NA,A
Upvotes: 3
Views: 2146
Reputation: 6476
According to official docs - https://docs.python.org/3/howto/sorting.html#the-old-way-using-the-cmp-parameter
When porting code from Python 2.x to 3.x, the situation can arise when you have the user supplying a comparison function and you need to convert that to a key function. The following wrapper makes that easy to do:
def cmp_to_key(mycmp):
'Convert a cmp= function into a key= function'
class K:
def __init__(self, obj, *args):
self.obj = obj
def __lt__(self, other):
return mycmp(self.obj, other.obj) < 0
def __gt__(self, other):
return mycmp(self.obj, other.obj) > 0
def __eq__(self, other):
return mycmp(self.obj, other.obj) == 0
def __le__(self, other):
return mycmp(self.obj, other.obj) <= 0
def __ge__(self, other):
return mycmp(self.obj, other.obj) >= 0
def __ne__(self, other):
return mycmp(self.obj, other.obj) != 0
return K
To convert to a key function, just wrap the old comparison function:
>>> def reverse_numeric(x, y):
... return y - x
>>> sorted([5, 2, 4, 1, 3], key=cmp_to_key(reverse_numeric))
[5, 4, 3, 2, 1]
Upvotes: 1
Reputation: 365697
First, if you haven't read the Sorting HOWTO, definitely read that; it explains a lot that may not be obvious at first.
For your first example, two IPv4 addresses, the answer is pretty simple.
To compare two addresses, one obvious thing to do is convert them both from dotted-four strings into tuples of 4 ints, then just compare the tuples:
def cmp_ip(ip1, ip2):
ip1 = map(int, ip1.split('.'))
ip2 = map(int, ip2.split('.'))
return cmp(ip1, ip2)
An even better thing to do is convert them to some kind of object that represents an IP address and has comparison operators. In 3.4+, the stdlib has such an object built in; let's pretend 2.7 did as well:
def cmp_ip(ip1, ip2):
return cmp(ipaddress.ip_address(ip1), ipaddress.ip_address(ip2))
It should be obvious that these are both even easier as key functions:
def key_ip(ip):
return map(int, ip.split('.'))
def key_ip(ip):
return ipaddress.ip_address(ip)
For your second example, ham radio callsigns: In order to write a cmp
function, you have to be able to break each ham address into the letters, numbers, letters portions, then compare the numbers, then compare the first letters, then compare the second letters. In order to write a key
function, you have to be able to break down a ham address into the letters, numbers, letters portions, then return a tuple of (numbers, first letters, second letters). Again the key function is actually easier, not harder.
And really, this is the case for most examples anyone was able to come up with. Most complicated comparisons ultimately come down to a complicated conversion into some sequence of parts, and then simple lexicographical comparison of that sequence.
That's why cmp
functions were deprecated way back in 2.4 and finally removed in 3.0.
Of course there are some cases where a cmp
function is easier to read—most of the examples people try to come up with turn out to be wrong, but there are some. And there's also code which has been working for 20 years and nobody wants to rethink it in new terms for no benefit. For those cases, you've got cmp_to_key
.
There's actually another reason cmp
was deprecated, on top of this one, and maybe a third.
In Python 2.3, types had a __cmp__
method, which was used for handling all of the operators. In 2.4, they grew the six methods __lt__
, __eq__
, etc. as a replacement. This allows for more flexibility—e.g., you can have types that aren't total-ordered. So, 2.3's when compared a < b
, it was actually doing a.__cmp__(b) < 0
, which maps in a pretty obvious way to a cmp
argument. But in 2.4+, a < b
does a.__lt__(b)
, which doesn't. This confused a lot of people over the years, and removing both __cmp__
and the cmp
argument to sort functions removed that confusion.
Meanwhile, if you read the Sorting HOWTO, you'll notice that before we had cmp
, the only way to do this kind of thing was decorate-sort-undecorate (DSU). Notice that it's blindly obvious how to map a good key
function to a good DSU sort and vice-versa, but it's definitely not obvious with a cmp
function. I don't remember anyone explicitly mentioning this one on the py3k list, but I suspect people may have had it in their heads when deciding whether to finally kill cmp
for good.
Upvotes: 6
Reputation: 61875
To use the new key
argument, simply decompose the comparison to another object that already implements a well-ordered comparison, such as to a tuple or list (eg. a sequence of integers). These types work well because they are sequence-wise ordered.
def ip_as_components (ip):
return map(int, ip.split('.'))
sorted_ips = sorted(ips, key=ip_as_components)
The ordering of the each of the components are the same as the individual tests as found in a traditional compare-and-then-compare-by function.
Looking at the HAM ordering it may look like:
def ham_to_components (ham_code):
# .. decompose components based on ordering of each
return (prefix_letters, numbers, postfix_letters)
The key
approach (similar to "order by" found in other languages) is generally a simpler and more natural construct to deal with - assuming that the original types are not already well-ordered. The main drawback with this approach is that partially reversed (eg. asc then desc) ordering can be tricky, but that is solvable by returning nested tuples etc.
In Py3.0, the cmp parameter was removed entirely (as part of a larger effort to simplify and unify the language, eliminating the conflict between rich comparisons and the
__cmp__()
magic method).
If absolutely needing sorted
with a custom "cmp", cmp_to_key
can be trivially used.
sorted_ips = sorted(ips, key=functools.cmp_to_key(ip_compare))
Upvotes: 4