JL Peyret

Reputation: 12154

Complex sorting, easily done with cmp function, but how do I plan for Python 3?

I want to sort the columns returned from a database query for presentation. Within the results, I'd like to sort by:

  1. key fields first, ordered by position in the query's results (as this generally reflects the backend's unique index).

  2. the remaining (non-key) fields alphabetically, because their position reflects the table's physical field order, which is of no interest.

Note: this is not something that I want to do at the database level; it's a Python sorting question.

I can do this as follows in Python 2.7 (see code below), but want to prepare for Python 3.

I have written new-style operator.attrgetter/itemgetter based sorts in the past, including successive passes, where you first sort by one key function, then by another. But I can't see how Python 3's key function system will handle branching.

#test data, mangled on purpose
data = [
    dict(fieldname="anotherkey2", pos=1, key=True),
    dict(fieldname="somekey1", pos=0, key=True),
    dict(fieldname="bfield3", pos=2, key=False),
    dict(fieldname="afield", pos=3, key=False),
    dict(fieldname="cfield", pos=4, key=False),
]

#exp keys, first, by position, then non-keys, alphabetic order
exp = ["somekey1","anotherkey2","afield","bfield3","cfield"]

def cmp2(field1, field2):

    key1, key2 = field1.get("key"), field2.get("key")

    #if both are keys, go by position in cursor results
    if key1 and key2:
        return cmp(field1["pos"], field2["pos"])

    #if neither are keys, order alphabetically
    if not (key1 or key2):
        return cmp(field1["fieldname"], field2["fieldname"])

    #otherwise, keys go first
    return cmp(key2, key1)

for func in [cmp2]:
    test_data = data[:]
    test_data.sort(cmp=func)
    got = [field["fieldname"] for field in test_data]
    try:
        msg = "fail with function:%s exp:%s:<>:%s:got" % (func.__name__, exp, got)
        assert exp == got, msg
        print ("success with %s: %s" % (func.__name__, got))
    except AssertionError,e:
        print(e)

Output:

success with cmp2: ['somekey1', 'anotherkey2', 'afield', 'bfield3', 'cfield']

Additionally, the cmp_to_key recipe in Sorting HOWTO looks scary and quite un-pythonic, with a lot of repeated code for each magic function. And I am unsure how functools.cmp_to_key is relevant.

I suppose what I could do is pre-decorate the field dictionaries with an extra attribute that defines how to sort, something like a sortby = (not key, pos if key else 0, fieldname) tuple, but I am hoping for a cleaner approach.

This works, but is there anything better?

from operator import itemgetter

def pre_compute(data):
    for row in data:
        key, pos, fieldname = row["key"], row["pos"], row["fieldname"]
        sortby = (not key, (pos if key else 0), fieldname)
        row["sortby"] = sortby

for func in [pre_compute]:
    test_data = data[:]

    func(test_data)

    test_data.sort(key=itemgetter('sortby'))

    got = [field["fieldname"] for field in test_data]
    try:
        msg = "fail with function:%s exp:%s:<>:%s:got" % (func.__name__, exp, got)
        assert exp == got, msg
        print ("success with %s: %s" % (func.__name__, got))
    except AssertionError as e:
        print(e)

Upvotes: 3

Views: 257

Answers (2)

user2357112

Reputation: 280426

Sort by whether the field is a key field, then either the position or the field name depending on whether it's a key field.

def keyfunc(field):
    return (not field['key'], field['pos'] if field['key'] else field['fieldname'])
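
As a quick check, here is a minimal sketch that runs this key function against the test data from the question (the data and exp lists are copied from the question; Python 3 is assumed). The mixed int/str second element should be safe here because tuples are compared element by element, so a position and a field name are only ever compared with values of their own kind once the first element ties.

```python
def keyfunc(field):
    # False sorts before True, so key fields (not key == False) come first.
    # Key fields then tie-break on position, non-key fields on name.
    return (not field['key'], field['pos'] if field['key'] else field['fieldname'])

# test data copied from the question
data = [
    dict(fieldname="anotherkey2", pos=1, key=True),
    dict(fieldname="somekey1", pos=0, key=True),
    dict(fieldname="bfield3", pos=2, key=False),
    dict(fieldname="afield", pos=3, key=False),
    dict(fieldname="cfield", pos=4, key=False),
]
exp = ["somekey1", "anotherkey2", "afield", "bfield3", "cfield"]

# sorted() returns a new list and leaves data untouched
got = [field["fieldname"] for field in sorted(data, key=keyfunc)]
print(got)
```

No pre-computed sortby entry is needed with this approach, since the key is built on the fly for each row.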

Upvotes: 1

jasonharper

Reputation: 9597

cmp_to_key() (either the standalone version, or the one built in to the functools module) turns an arbitrary function usable with sort's cmp= parameter into one usable with the newer key= parameter. It would be the most straightforward solution to your problem (although getting the database to do it for you might be better, as some commenters have pointed out).
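
A minimal sketch of what that could look like under Python 3, reusing cmp2 from the question. The small cmp helper is an assumption added here, since the builtin cmp no longer exists in Python 3; the rest of the comparison logic is unchanged.

```python
from functools import cmp_to_key

def cmp(a, b):
    # stand-in for the Python 2 builtin: returns -1, 0 or 1
    return (a > b) - (a < b)

def cmp2(field1, field2):
    key1, key2 = field1.get("key"), field2.get("key")
    # if both are keys, go by position in cursor results
    if key1 and key2:
        return cmp(field1["pos"], field2["pos"])
    # if neither is a key, order alphabetically
    if not (key1 or key2):
        return cmp(field1["fieldname"], field2["fieldname"])
    # otherwise, keys go first
    return cmp(key2, key1)

# test data copied from the question
data = [
    dict(fieldname="anotherkey2", pos=1, key=True),
    dict(fieldname="somekey1", pos=0, key=True),
    dict(fieldname="bfield3", pos=2, key=False),
    dict(fieldname="afield", pos=3, key=False),
    dict(fieldname="cfield", pos=4, key=False),
]
exp = ["somekey1", "anotherkey2", "afield", "bfield3", "cfield"]

test_data = data[:]
# cmp_to_key wraps the comparison function so it can be passed as key=
test_data.sort(key=cmp_to_key(cmp2))
got = [field["fieldname"] for field in test_data]
print(got)
```

The only call-site change from the Python 2 version is replacing sort(cmp=cmp2) with sort(key=cmp_to_key(cmp2)).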

Upvotes: 2
