Cmag
Cmag

Reputation: 15770

python sort list

short of renaming/fixing the logging module on the webservers... when i do a list.sort(), the list entries get placed in the following order:

2011-09-21 19:15:54,731 DEBUG __main__ 44: running www.site.com-110731.log.0.gz
2011-09-21 19:15:54,731 DEBUG __main__ 44: running www.site.com-110731.log.1.gz
2011-09-21 19:15:54,731 DEBUG __main__ 44: running www.site.com-110731.log.2.gz
2011-09-21 19:15:54,732 DEBUG __main__ 44: running www.site.com-110731.log.3.gz
2011-09-21 19:15:54,732 DEBUG __main__ 44: running www.site.com-110731.log.gz

how would i sort a list, to get (ie the entry eithout a digit to be first):

2011-09-21 19:15:54,732 DEBUG __main__ 44: running www.site.com-110731.log.gz
2011-09-21 19:15:54,731 DEBUG __main__ 44: running www.site.com-110731.log.0.gz
2011-09-21 19:15:54,731 DEBUG __main__ 44: running www.site.com-110731.log.1.gz
2011-09-21 19:15:54,731 DEBUG __main__ 44: running www.site.com-110731.log.2.gz
2011-09-21 19:15:54,732 DEBUG __main__ 44: running www.site.com-110731.log.3.gz

THANKS!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Upvotes: 0

Views: 388

Answers (1)

Zach Snow
Zach Snow

Reputation: 1054

You would probably want to write a custom comparator to pass to sort; in fact, you probably need to anyway, because you're likely getting a lexicographical sort order instead of the intended (I presume) numerical order.

For instance, if you know that the filenames will only differ in those digits, you'd write a comparator that extracts those digits, converts them to int, and then compares based on that value.

Taking your examples as canonical, your comparator might look something like this:

import re
def extract(s):
    r = re.compile(r'\.(\d+)\.log\.((\d*)\.)?gz')
    m = r.search(s)
    file = int(m.group(1))
    if not m.group(2):
        return (file, -1)
    index = int(m.group(3))
    return (file, index)

def comparator(s1, s2): return cmp(extract(s1), extract(s2))

This prefers to sort based on the "file" number (the first one), and then by the "index" number (the second one). Note that it takes advantage of the fact that using cmp on tuples works as we require.

Upvotes: 4

Related Questions