ZtoYi

Reputation: 192

Sort list by member variables correctly

I am trying to sort a list of objects by a member variable. Going through Stack Overflow, I found the method below. However, lsort compares character by character, so 5, 3, 7, 21, 64 sorts to 21, 3, 5, 64, 7 (I would like numeric order: 3, 5, 7, 21, 64). I am unsure how to fix this, because some keys look like D239, D97, D11 (lsort gives D11, D239, D97; I would like D11, D97, D239). I would prefer a single method that handles both cases, but two separate methods would be OK.

import operator 

class foo:
    def __init__(self, key1, data1, data2):
        #all of these values are strings, even though some may be ints
        self.key = key1
        self.d1 = data1
        self.d2 = data2

#sorts list l by member variable search
def lsort (l, search):
    #this doesn't actually work very well.
    #key can be int or string
    #when key is an int, this seems to order by number of digits, then low to high
    #(e.g. 11, 12, 40, 99, 3, 6, 8)
    return sorted(l, key=operator.attrgetter(search))


l1 = [foo('12', 'foo1', None), foo('8', 'qwer', None), foo('7', 'foo3', None), foo('13', 'foo2', None), foo('77', 'foo4', None), foo('12', 'foo5', None) ]


for item in lsort(l1, 'key'):
    print item.key, item.d1, item.d2

OUTPUT:

12 foo1 None 
12 foo5 None 
13 foo2 None
7 foo3 None
77 foo4 None
8 qwer None

EXPECTED:

7 foo3 None
8 qwer None
12 foo1 None
12 foo5 None
13 foo2 None
77 foo4 None

Why is this happening? When I run the same sort on an extremely basic class, it seems to work fine.

class foo:
    def __init__(self, d1):
        self.bar= d1

Please assist. Thanks.

Upvotes: 1

Views: 163

Answers (4)

Bi Rico

Reputation: 25813

You want to make sure the keys are compared as ints instead of strings. Strings are sorted lexicographically, character by character, i.e. '7' > '11'. The easiest way to do this is to define your own custom comparison methods for your foo class:

from functools import total_ordering

@total_ordering
class foo:
    def __init__(self, key1, data1, data2):
        #all of these values are strings, even though some may be ints
        self.key = key1
        self.d1 = data1
        self.d2 = data2

    @staticmethod
    def _as_int(value):
        try:
            return int(value)
        except ValueError:
            return value

    def __lt__(self, other):
        # strict "less than"; total_ordering derives <=, > and >= from __lt__ and __eq__
        return self._as_int(self.key) < self._as_int(other.key)
    def __eq__(self, other):
        return self._as_int(self.key) == self._as_int(other.key)

l1 = [foo('12', 'foo1', None),
      foo('8', 'qwer', None),
      foo('7', 'foo3', None),
      foo('13', 'foo2', None),
      foo('77', 'foo4', None),
      foo('12', 'foo5', None),
      foo('A', 'foo', None),
      foo('B', 'foo', None)]

for item in sorted(l1):
    print item.key, item.d1, item.d2

Which gives:

7 foo3 None
8 qwer None
12 foo1 None
12 foo5 None
13 foo2 None
77 foo4 None
A foo None
B foo None
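Note that placing 'A' and 'B' after the numbers relies on Python 2, where an int always compares as smaller than a str. In Python 3 that comparison raises a TypeError. A minimal sketch of one way around it, assuming you want numeric keys first and everything else after them in alphabetical order (mixed_key is a made-up helper name, not part of the code above):

```python
def mixed_key(value):
    """Build a tuple so ints are only compared with ints and strs with strs."""
    try:
        # group 0: keys that parse as integers, ordered numerically
        return (0, int(value), "")
    except ValueError:
        # group 1: everything else, ordered alphabetically
        return (1, 0, value)

keys = ['12', '8', '7', '13', '77', '12', 'A', 'B']
ordered = sorted(keys, key=mixed_key)
print(ordered)
```

For the foo objects the same helper could be passed as key=lambda item: mixed_key(item.key).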

If you know for sure that the key attribute will be numeric, you can simplify the code a little bit.
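For example, if every key is guaranteed to parse as an integer, a rough sketch would skip the comparison methods entirely and convert the key inside sorted (the name by_numeric_key is only for illustration):

```python
class foo:
    def __init__(self, key1, data1, data2):
        self.key = key1
        self.d1 = data1
        self.d2 = data2

l1 = [foo('12', 'foo1', None), foo('8', 'qwer', None), foo('7', 'foo3', None),
      foo('13', 'foo2', None), foo('77', 'foo4', None), foo('12', 'foo5', None)]

# Convert each key to int only for the comparison; the stored key stays a string.
# int() raises ValueError if a key is not numeric, so this assumes clean data.
by_numeric_key = sorted(l1, key=lambda item: int(item.key))

for item in by_numeric_key:
    print("%s %s %s" % (item.key, item.d1, item.d2))
```

Because sorted is stable, the two items with key '12' keep their original relative order.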

Upvotes: 1

btilly

Reputation: 46399

Ah, yes. The old, "Just put it in a natural order!" problem.

Translating an old hack that I got in Perl from Tye McQueen, something like this should work for strings:

import re

def replace_match(match):
    value = match.group(0)
    if value[0] == ".":
        return value
    else:
        return ("0"*(9-len(value))) + value

def replace_with_natural(string):
    # raw string so the regex backslashes are not treated as string escapes
    return re.sub(r"(\.\d*|[1-9]\d{0,8})", replace_match, string)

items = ["hello1", "hello12", "foo12.1", "foo12.05", "hello3", "foo.12.12"]
print(sorted(items, key=replace_with_natural))

The idea is that we replace every number in the string with a number of fixed length that sorts lexicographically in the way we like.

Note that ANY function like this will run into stuff it handles poorly. In this case, scientific notation is handled poorly. But this will do what people expect with embedded numbers 99.99% of the time.
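To tie this back to the question, here is a small sketch of the padding applied to keys like D239, D97, D11 stored on objects. The two functions are repeated so the snippet runs on its own; the Record class is a hypothetical stand-in for the asker's foo class.

```python
import re

def replace_match(match):
    value = match.group(0)
    if value[0] == ".":
        return value                              # leave fractional parts untouched
    return ("0" * (9 - len(value))) + value       # left-pad integers to 9 digits

def replace_with_natural(string):
    return re.sub(r"(\.\d*|[1-9]\d{0,8})", replace_match, string)

class Record(object):
    """Hypothetical stand-in for an object with a string key."""
    def __init__(self, key):
        self.key = key

records = [Record(k) for k in ["D239", "D97", "D11"]]

# What the sort actually compares: the padded strings.
padded = [replace_with_natural(r.key) for r in records]
print(padded)

ordered_keys = [r.key for r in sorted(records, key=lambda r: replace_with_natural(r.key))]
print(ordered_keys)
```

The padded strings all have the same number of digits, so plain string comparison puts them in numeric order.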

Upvotes: 1

BlivetWidget

Reputation: 11063

You're sorting strings. The string '12' comes before '2', for example. Cast them as numbers if you want to sort numerically.
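A quick sketch of the difference, using the numbers from the question (this assumes every key is a plain integer string):

```python
keys = ['5', '3', '7', '21', '64']

as_strings = sorted(keys)            # compared character by character
as_numbers = sorted(keys, key=int)   # compared by numeric value

print(as_strings)
print(as_numbers)
```

Passing key=int converts the values only for the comparison, so the list still contains the original strings.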

Upvotes: 0
