sgp667
sgp667

Reputation: 1875

Pythonic way of sorting classes with possible Nones in variables

I have a class that looks more or less like this:

class Something():
    def __init__(self,a=None,b=None):
        self.a = a
        self.b = b

I want to be able to sort it in a list, normally I'd just implement method like this:

def __lt__(self,other):
    return (self.a, self.b) < (other.a, other.b)

But this will raise an error in following case:

sort([Something(1,None),Something(1,1)])

While I want is for None values to be treated as greated than or following output:

[Something(1,1),Something(1,None)]

First thing that somes to my mind is change __lt__ to:

def __lt__(self,other):
    if self.a and other.a:
        if self.a != other.a:
            return self.a < other.a
    elif self.a is None:
        return True
    elif other.a is None:
        return False

    if self.b and other.b:
        if self.b != other.b:
            return self.b < other.b
    elif self.b is None:
        return True
    return False

This would give me the correct results but its just ugly and python usually has a simpler way, and I don't really want to do it for each variable that I use in sorting of my full class(omitted from here to make problem clearer).

So what is the pythonic way of solving this?

Note

I also tried following but I'm assuming that even better is possible:

This would:

def __lt__(self,other):
    sorting_attributes = ['a', 'b']
    for attribute in sorting_attributes:
        self_value = getattr(self,attribute)
        other_value = getattr(other,attribute)
        if self_value and other_value:
            if self_value != other_value:
                return self_value  < other_value
        elif self_value   is None:
            return True
        elif self_value  is None:
            return False

Really trying to internalize the Zen of Pyhton and I know that my code is ugly so how do I fix it?

Upvotes: 1

Views: 96

Answers (3)

ShadowRanger
ShadowRanger

Reputation: 155497

A completely different design I thought of later (posted separately because it's so different it should really be evaluated independently):

Map all your attributes to tuples, where the first element of every tuple is a bool based on the None-ness of the attribute, and the second is the attribute value itself. None/non-None mismatches would short-circuit on the bool representing None-ness preventing the TypeError, everything else would fall back to comparing the good types:

def __lt__(self, other):
    def _key(attr):
        # Use attr is not None to make None less than everything, is None for greater
        return (attr is None, attr)
    return (_key(self.a), _key(self.b)) < (_key(other.a), _key(other.b))

Probably slightly slower than my other solution in the case where no None/non-None pair occurs, but much simpler code. It also has the advantage of continuing to raise TypeErrors when mismatched types other than None/non-None arise, rather than potentially misbehaving. I'd definitely call this one my Pythonic solution, even if it is slightly slower in the common case.

Upvotes: 1

ShadowRanger
ShadowRanger

Reputation: 155497

A solution for the general case (where there may not be a convenient "bigger than any value" solution, and you don't want the code to grow more complex as the number of attributes increases), which still operates as fast as possible in the presumed common case of no None values. It does assume TypeError means None was involved, so if you're likely to have mismatched types besides None, this gets more complicated, but frankly, a class design like that is painful to contemplate. This works for any scenario with two or more keys (so attrgetter returns a tuple) and only requires changing the names used to construct the attrgetter to add or remove fields to compare.

def __lt__(self, other, _key=operator.attrgetter('a', 'b')):
    # Get the keys once for both inputs efficiently (avoids repeated lookup)
    sattrs = _key(self)
    oattrs = _key(other)
    try:
        return sattrs < oattrs  # Fast path for no Nones or only paired Nones
    except TypeError:
        for sattr, oattr in zip(sattrs, oattrs):
            # Only care if exactly one is None, because until then, must be equal, or TypeError
            # wouldn't occur as we would have short-circuited
            if (sattr is None) ^ (oattr is None):
                # Exactly one is None, so if it's the right side, self is lesser
                return oattr is None
        # TypeError implied we should see a mismatch, so assert this to be sure
        # we didn't have a non-None related type mismatch
        assert False, "TypeError raised, but no None/non-None pair seen

A useful feature of this design is that under no circumstances are rich comparisons invoked for any given attribute more than once; the failed attempt at the fast path proves that there must (assuming invariant of types being either compatible or None golds) be a run of zero or more attribute pairs with equal values, followed by a None/non-None mismatch. Since everything we care about is known equal or a None/non-None mismatch, we don't need to invoke potentially expensive rich comparisons again, we just do cheap identity testing to find the None/non-None mismatch and then return based on which side was None.

Upvotes: 0

blhsing
blhsing

Reputation: 106883

An easy way to do this is to convert None to infinity, i.e. float('inf'):

def __lt__(self, other):
    def convert(i):
        return float('inf') if i is None else i
    return [convert(i) for i in (self.a, self.b)] < [convert(i) for i in (other.a, other.b)]

Upvotes: 1

Related Questions