TrilceAC

Reputation: 323

Should __eq__ compare objects of two different types?

In the problem I'm working on, there are data identifiers of the form scope:name, where both scope and name are strings. name has several parts separated by dots, like part1.part2.part3.part4.part5. Often, but not always, scope is just equal to part1 of name. The code I'm writing has to work with different systems that provide or require the identifiers in different formats. Sometimes they require the full string representation, scope:name; on other occasions, calls take two separate parameters, scope and name. When receiving information from other systems, sometimes the full string scope:name is returned, sometimes scope is omitted and has to be inferred from name, and sometimes a dict containing scope and name is returned.

To make these identifiers easier to use, I have created a class to manage them internally, so that I don't have to write the same conversions, splits and formats over and over again. The class is quite simple. It only has two attributes (scope and name), a method to parse strings into objects of the class, and some magic methods to represent the objects. In particular, __str__(self) returns the object in the form scope:name, which is the fully qualified name (fqn) of the identifier:

class DID(object):
    """Represent a data identifier."""

    def __init__(self, scope, name):
        self.scope = scope
        self.name = name

    @classmethod
    def parse(cls, s, auto_scope=False):
        """Create a DID object given its string representation.

        Parameters
        ----------
        s : str
            The string, i.e. 'scope:name', or 'name' if auto_scope is True.

        auto_scope : bool, optional
            If True, and when no scope is provided, the scope will be set to
            the projectname. Default False.

        Returns
        -------
        DID
            The DID object that represents the given fully qualified name.

        """
        if isinstance(s, basestring):
            arr = s.split(':', 2)
        else:
            raise TypeError('string expected.')

        if len(arr) == 1:
            if auto_scope:
                return cls(s.split('.', 1)[0], s)
            else:
                raise ValueError(
                    "Expecting 'scope:name' when auto_scope is False"
                )
        elif len(arr) == 2:
            return cls(*arr)
        else:
            raise ValueError("Too many ':'")

    def __repr__(self):
        return "DID(scope='{0.scope}', name='{0.name}')".format(self)

    def __str__(self):
        return u'{0.scope}:{0.name}'.format(self)
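
To make the parsing rules concrete, here is a minimal usage sketch. It assumes a Python 3 port of the class above, so `str` stands in for `basestring`; the identifier values are made up for illustration.

```python
class DID(object):
    """Compact Python 3 port of the class above (assumption: str replaces basestring)."""

    def __init__(self, scope, name):
        self.scope = scope
        self.name = name

    @classmethod
    def parse(cls, s, auto_scope=False):
        if not isinstance(s, str):
            raise TypeError('string expected.')
        arr = s.split(':', 2)
        if len(arr) == 1:
            if auto_scope:
                # No scope given: infer it from the first dotted part of the name.
                return cls(s.split('.', 1)[0], s)
            raise ValueError("Expecting 'scope:name' when auto_scope is False")
        if len(arr) == 2:
            return cls(*arr)
        raise ValueError("Too many ':'")

    def __repr__(self):
        return "DID(scope='{0.scope}', name='{0.name}')".format(self)

    def __str__(self):
        return '{0.scope}:{0.name}'.format(self)


# Hypothetical identifiers, used only to exercise both parsing paths.
explicit = DID.parse('user:part1.part2.part3')
inferred = DID.parse('part1.part2.part3', auto_scope=True)

print(str(explicit))   # user:part1.part2.part3
print(repr(inferred))  # DID(scope='part1', name='part1.part2.part3')
```

Without auto_scope, a bare name raises ValueError, as does a string with more than one ':'.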

As I said, the code has to compare these objects with strings and to use their string representation in some methods. I am tempted to write the __eq__ magic method and its counterpart __ne__. The following is an implementation of just __eq__:

    # APPROACH 1:
    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.scope == other.scope and self.name == other.name
        elif isinstance(other, basestring):
            return str(self) == other
        else:
            return False

As you can see, it defines equality between two DIDs and also between a DID and a string. My concern is whether this is good practice:

On the one hand, when other is a string, the method casts self to a string, and I keep thinking that explicit is better than implicit. You could end up believing that you are working with two strings, which is not the case for self.

On the other hand, from the point of view of meaning, a DID represents the fqn scope:name, so it makes sense to compare it for equality with strings, just as an int can be compared with a float, or any two objects derived from basestring can be compared.

I have also thought about leaving the basestring case out of the implementation, but to me this is even worse and prone to mistakes:

    # APPROACH 2:
    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.scope == other.scope and self.name == other.name
        else:
            return False

In approach 2, a comparison for equality between a DID object and a string, both representing the same identifier, returns False. To me, this is even more prone to mistakes.
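
The difference between the two approaches can be seen side by side in a small sketch. It assumes Python 3 (`str` instead of `basestring`), and the class names DIDApproach1 and DIDApproach2 are hypothetical, chosen only so both variants can coexist in one script.

```python
class _DIDBase(object):
    """Shared state and string form for both hypothetical variants."""

    def __init__(self, scope, name):
        self.scope = scope
        self.name = name

    def __str__(self):
        return '{0.scope}:{0.name}'.format(self)


class DIDApproach1(_DIDBase):
    # Approach 1: equal to other DIDs and to matching strings.
    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.scope == other.scope and self.name == other.name
        elif isinstance(other, str):
            return str(self) == other
        else:
            return False


class DIDApproach2(_DIDBase):
    # Approach 2: equal to other DIDs only.
    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.scope == other.scope and self.name == other.name
        else:
            return False


s = 'user:user.file1'  # made-up identifier
loose = DIDApproach1('user', 'user.file1')
strict = DIDApproach2('user', 'user.file1')

print(loose == s)   # True
print(s == loose)   # True: str defers to the reflected DIDApproach1.__eq__
print(strict == s)  # False, although both represent the same identifier
```

Note that under approach 1 the comparison also works with the string on the left-hand side, because str returns NotImplemented for unknown types and Python then tries the right-hand operand.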

What are the best practices in this situation? Should the comparison between a DID and a string be implemented as in approach 1, even though objects of different types might be considered equal? Should I use approach 2, even though s != DID.parse(s)? Should I not implement __eq__ and __ne__ at all, so that there are never any misunderstandings?

Upvotes: 3

Views: 1389

Answers (2)

Simply Beautiful Art

Reputation: 1749

This is something that was used in the other answer but only really explained in the comments. The best way to handle this is to explicitly handle cases your class is ready to handle and then to allow other classes to define how to handle your class if they want.

class MyClass:

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        if isinstance(other, MyClass):
            return self.x == other.x
        else:
            return NotImplemented


class YourClass:

    def __init__(self, y):
        self.y = y

    def __eq__(self, other):
        if isinstance(other, YourClass):
            return self.y == other.y
        elif isinstance(other, MyClass):
            return self.y == other.x
        else:
            return NotImplemented

By returning NotImplemented, you allow other classes to decide how they should compare to your class if they need to. If both classes return NotImplemented, i.e. neither knows how to handle the other, then == falls back to an identity comparison, which is False for two distinct objects.
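
A short sketch of how this dispatch plays out in practice, repeating the two classes so the snippet runs on its own (the variable names are only illustrative):

```python
class MyClass:
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        if isinstance(other, MyClass):
            return self.x == other.x
        return NotImplemented


class YourClass:
    def __init__(self, y):
        self.y = y

    def __eq__(self, other):
        if isinstance(other, YourClass):
            return self.y == other.y
        elif isinstance(other, MyClass):
            return self.y == other.x
        return NotImplemented


mine = MyClass(1)
yours = YourClass(1)

# YourClass knows about MyClass, so it answers directly.
forward = (yours == mine)

# MyClass.__eq__ returns NotImplemented here, so Python retries with the
# reflected call YourClass.__eq__(yours, mine).
reflected = (mine == yours)

# Calling the method directly exposes the sentinel that == hides.
direct = MyClass.__eq__(mine, yours)

# Neither MyClass nor str knows the other type, so == falls back to identity.
unrelated = (mine == "1")

print(forward, reflected, direct, unrelated)  # True True NotImplemented False
```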

Upvotes: 0

Sam Mason

Reputation: 16184

A few classes in Python (but I can't think of anything in the standard library off the top of my head) define an equality operator that handles multiple types on the RHS. One common library that does support this is NumPy, with:

import numpy as np

np.array(1) == 1

evaluating to True. In general I think I'd discourage this sort of thing, as there are lots of corner cases where this behaviour can get tricky. E.g. see the write up in Python 3 __hash__ method (similar things exist in Python 2, but it's end-of-life). In cases where I have written similar code, I've tended to end up with something closer to:

def __eq__(self, other):
    if isinstance(other, str):
        try:
            other = self.parse(other)
        except ValueError:
            return NotImplemented

    if isinstance(other, DID):
        return self.scope == other.scope and self.name == other.name

    return NotImplemented
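
One of the hashing corner cases mentioned above can be sketched as follows. This is an illustration under assumptions, not part of the original code: it targets Python 3, and LooseDID and HashableLooseDID are hypothetical stand-ins for a DID whose __eq__ also accepts strings.

```python
class LooseDID:
    """Hypothetical DID whose __eq__ also accepts strings."""

    def __init__(self, scope, name):
        self.scope = scope
        self.name = name

    def __str__(self):
        return '{0.scope}:{0.name}'.format(self)

    def __eq__(self, other):
        if isinstance(other, str):
            return str(self) == other
        if isinstance(other, LooseDID):
            return (self.scope, self.name) == (other.scope, other.name)
        return NotImplemented


class HashableLooseDID(LooseDID):
    # Hash the string form, so an object that compares equal to a string
    # also hashes like that string.
    def __hash__(self):
        return hash(str(self))


did = LooseDID('user', 'user.file1')
equal_to_string = (did == 'user:user.file1')

# In Python 3, defining __eq__ without __hash__ sets __hash__ to None,
# so instances cannot be used in sets or as dict keys.
try:
    hash(did)
    hashable = True
except TypeError:
    hashable = False

hdid = HashableLooseDID('user', 'user.file1')
found_in_set = hdid in {'user:user.file1'}

print(equal_to_string, hashable, found_in_set)  # True False True
```

Whether a DID should really be found in a set of plain strings is a design decision; the point is only that equality across types forces you to think about __hash__ as well.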

Further to this, I'd suggest making objects like this immutable and you have a few ways of doing this. Python 3 has nice dataclasses, but given that you seem to be stuck under Python 2, you might use namedtuples, something like:

from collections import namedtuple

class DID(namedtuple('DID', ('scope', 'name'))):
    __slots__ = ()

    @classmethod
    def parse(cls, s, auto_scope=False):
        return cls('foo', 'bar')  # placeholder for the real parsing logic

    def __eq__(self, other):
        if isinstance(other, str):
            try:
                other = self.parse(other)
            except ValueError:
                return NotImplemented

        return super(DID, self).__eq__(other)

which gives you immutability and a repr method for free, but you might want to keep your own str method. The __slots__ attribute means that accidentally assigning to obj.scopes will fail, but you might want to allow this behaviour.
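
For comparison, here is a rough sketch of the dataclass route mentioned above, assuming Python 3.7 or later. The simplified parse method and the sample identifier are assumptions made for illustration, and this version deliberately keeps the generated __eq__, so it only compares DID with DID.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class DID:
    """Immutable data identifier; __eq__, __hash__ and __repr__ are generated."""
    scope: str
    name: str

    @classmethod
    def parse(cls, s, auto_scope=False):
        # Simplified parsing, assumed here for illustration.
        parts = s.split(':')
        if len(parts) == 2:
            return cls(*parts)
        if len(parts) == 1 and auto_scope:
            return cls(s.split('.', 1)[0], s)
        raise ValueError("cannot parse {!r} as 'scope:name'".format(s))

    def __str__(self):
        return '{0.scope}:{0.name}'.format(self)


did = DID.parse('user:user.file1')
same = DID('user', 'user.file1')

print(did == same)                 # True
print(len({did, same}))            # 1, equal objects hash alike
print(did == 'user:user.file1')    # False, strings must be parsed explicitly

try:
    did.scope = 'other'
except FrozenInstanceError:
    print('DID instances are read-only')
```

With frozen=True and the default eq=True, the dataclass generates a __hash__ from the fields, so instances can be used in sets and as dict keys.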

Upvotes: 3
