Walker

Reputation: 1225

Custom Sort Complicated Strings in Python

I have a list of filenames conforming to the pattern: s[num][alpha1][alpha2].ext

I need to sort, first by the number, then by alpha1, then by alpha2. The last two aren't alphabetical, however, but rather should reflect a custom ordering.

I've created two lists representing the ordering for alpha1 and alpha2, like so:

alpha1Order = ["Fizz", "Buzz", "Ipsum", "Dolor", "Lorem"]
alpha2Order = ["Sit", "Amet", "Test"]

What's the best way to proceed? My first thought was to tokenize (somehow), splitting each filename into its component parts (s, num, alpha1, alpha2), and then sort, but I wasn't sure how to perform such a complicated sort. Using a key function seemed clunky, as this sort didn't seem to lend itself to a simple ordering.

Upvotes: 1

Views: 109

Answers (1)

Martijn Pieters

Reputation: 1122182

Once tokenized, your data is perfectly orderable with a key function. Just return each token's index in the alpha1Order and alpha2Order lists. Replace the lists with dictionaries to make the lookup easier:

alpha1Order = {token: i for i, token in enumerate(alpha1Order)}
alpha2Order = {token: i for i, token in enumerate(alpha2Order)}

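def keyfunction(filename):
    # tokenize() is your own helper; it should return (num, alpha1, alpha2) as strings
    num, alpha1, alpha2 = tokenize(filename)
    # compare num numerically, then by each token's position in the custom ordering
    return int(num), alpha1Order[alpha1], alpha2Order[alpha2]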

This returns a tuple to sort on. Python compares tuples element by element, so filenames are ordered by int(num) first; any that share the same number are ordered by the alpha1 position, and the alpha2 position breaks ties on the first two entries.
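To see the whole thing end to end, here is a minimal sketch. The tokenize() helper, the regular expression, the alpha1Rank/alpha2Rank names, and the sample filenames are all assumptions made up for illustration; the question doesn't show real filenames, so adapt the pattern to your data.

```python
import re

alpha1Order = ["Fizz", "Buzz", "Ipsum", "Dolor", "Lorem"]
alpha2Order = ["Sit", "Amet", "Test"]

# Map each token to its position in the custom ordering.
# (Hypothetical names; the lists are kept so the regex can be built from them.)
alpha1Rank = {token: i for i, token in enumerate(alpha1Order)}
alpha2Rank = {token: i for i, token in enumerate(alpha2Order)}

# Assumed filename shape: "s", digits, one alpha1 token, one alpha2 token, ".ext".
FILENAME_RE = re.compile(
    r"s(\d+)({})({})\.\w+".format(
        "|".join(map(re.escape, alpha1Order)),
        "|".join(map(re.escape, alpha2Order)),
    )
)

def tokenize(filename):
    """Split a filename into (num, alpha1, alpha2); hypothetical helper."""
    match = FILENAME_RE.fullmatch(filename)
    if match is None:
        raise ValueError("unexpected filename: {!r}".format(filename))
    return match.groups()

def keyfunction(filename):
    num, alpha1, alpha2 = tokenize(filename)
    return int(num), alpha1Rank[alpha1], alpha2Rank[alpha2]

# Made-up sample data.
filenames = [
    "s10FizzSit.txt",
    "s2LoremTest.txt",
    "s2BuzzAmet.txt",
    "s2BuzzSit.txt",
    "s10FizzAmet.txt",
]
sorted_names = sorted(filenames, key=keyfunction)
print(sorted_names)
```

Because the number is converted with int(), s2 sorts before s10, which a plain string sort would get wrong. A filename containing a token that isn't in either list raises ValueError here rather than being sorted silently into the wrong place.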

Upvotes: 3
