Reputation: 323
In the problem I'm working on, there are data identifiers of the form scope:name, where both scope and name are strings. name has different parts separated by dots, like part1.part2.part3.part4.part5. On many occasions, but not always, scope is just equal to part1 of name. The code I'm writing has to work with different systems that provide or require the identifiers in different patterns. Sometimes they just require the full string representation like scope:name; on other occasions calls take two separate parameters, scope and name. When receiving information from other systems, sometimes the full string scope:name is returned, sometimes scope is omitted and should be inferred from name, and sometimes a dict that contains scope and name is returned.
To ease the use of these identifiers, I have created a class to manage them internally, so that I don't have to write the same conversions, splits and formats over and over again. The class is quite simple. It only has two attributes (scope and name), a method to parse strings into objects of the class, and some magic methods to represent the objects. In particular, __str__(self) returns the object in the form scope:name, which is the fully qualified name (fqn) of the identifier:

    class DID(object):
        """Represent a data identifier."""

        def __init__(self, scope, name):
            self.scope = scope
            self.name = name

        @classmethod
        def parse(cls, s, auto_scope=False):
            """Create a DID object given its string representation.

            Parameters
            ----------
            s : str
                The string, i.e. 'scope:name', or 'name' if auto_scope is True.
            auto_scope : bool, optional
                If True, and when no scope is provided, the scope will be set to
                the projectname. Default False.

            Returns
            -------
            DID
                The DID object that represents the given fully qualified name.
            """
            if isinstance(s, basestring):
                arr = s.split(':', 2)
            else:
                raise TypeError('string expected.')
            if len(arr) == 1:
                if auto_scope:
                    return cls(s.split('.', 1)[0], s)
                else:
                    raise ValueError(
                        "Expecting 'scope:name' when auto_scope is False"
                    )
            elif len(arr) == 2:
                return cls(*arr)
            else:
                raise ValueError("Too many ':'")

        def __repr__(self):
            return "DID(scope='{0.scope}', name='{0.name}')".format(self)

        def __str__(self):
            return u'{0.scope}:{0.name}'.format(self)

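As a rough illustration of the three input shapes described above (full string, bare name, dict), here is a minimal sketch. It assumes Python 3 (so str replaces basestring), uses a cut-down stand-in for the class, and introduces a hypothetical helper to_did that is not part of the original code:

```python
class DID:
    """Cut-down Python 3 stand-in for the DID class above (illustration only)."""

    def __init__(self, scope, name):
        self.scope = scope
        self.name = name

    @classmethod
    def parse(cls, s, auto_scope=False):
        scope, sep, name = s.partition(':')
        if not sep:
            if not auto_scope:
                raise ValueError("Expecting 'scope:name' when auto_scope is False")
            # No scope given: infer it from the first dotted part of the name.
            return cls(s.split('.', 1)[0], s)
        if ':' in name:
            raise ValueError("Too many ':'")
        return cls(scope, name)

    def __str__(self):
        return '{0.scope}:{0.name}'.format(self)


def to_did(value):
    """Hypothetical helper: normalise any of the three input shapes to a DID."""
    if isinstance(value, dict):
        return DID(value['scope'], value['name'])
    return DID.parse(value, auto_scope=True)


full = to_did('part1:part1.part2.part3')
bare = to_did('part1.part2.part3')
from_dict = to_did({'scope': 'part1', 'name': 'part1.part2.part3'})
print(str(full), str(bare), str(from_dict))
```

All three calls end up with the same scope and name, so the conversion logic lives in one place regardless of which system produced the identifier.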
As I said, the code has to perform comparisons with strings and use the string representation in some methods. I am tempted to write the __eq__ magic method and its counterpart __ne__. The following is an implementation of just __eq__:

    # APPROACH 1:
    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.scope == other.scope and self.name == other.name
        elif isinstance(other, basestring):
            return str(self) == other
        else:
            return False

As you can see, it defines equality both between two DIDs and between a DID and a string, so that one can be compared with the other. My question is whether this is good practice:
On the one hand, when other is a string, the method casts self to a string, and I keep thinking that explicit is better than implicit. You could end up thinking that you are working with two strings, which is not the case for self.
On the other hand, from the point of view of meaning, a DID represents the fqn scope:name, and it makes sense to compare it for equality with strings, just as an int and a float can be compared, or any two objects derived from basestring.
I have also thought about leaving the basestring case out of the implementation, but to me this is even worse and prone to mistakes:

    # APPROACH 2:
    def __eq__(self, other):
        if isinstance(other, self.__class__):
            return self.scope == other.scope and self.name == other.name
        else:
            return False

In approach 2, a comparison for equality between a DID object and a string, both representing the same identifier, returns False. To me, this is even more prone to mistakes.
What are the best practices in this situation? Should the comparison between a DID and a string be implemented as in approach 1, even though objects of different types might be considered equal? Should I use approach 2 even though s != DID.parse(s)? Should I not implement __eq__ and __ne__ at all, so that there are never any misunderstandings?
Upvotes: 3
Views: 1389
Reputation: 1749
This is something that was used in the other answer but only really explained in the comments. The best way to handle this is to explicitly handle cases your class is ready to handle and then to allow other classes to define how to handle your class if they want.

    class MyClass:
        def __init__(self, x):
            self.x = x

        def __eq__(self, other):
            if isinstance(other, MyClass):
                return self.x == other.x
            else:
                return NotImplemented


    class YourClass:
        def __init__(self, y):
            self.y = y

        def __eq__(self, other):
            if isinstance(other, YourClass):
                return self.y == other.y
            elif isinstance(other, MyClass):
                return self.y == other.x
            else:
                return NotImplemented

By returning NotImplemented, you can allow other classes to figure out how they should compare to your class if they need to. If both classes return NotImplemented, i.e. they don't know how to handle each other, then == will return False.
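To see the fallback in action, the sketch below repeats the two classes and compares them in both orders. The variable names are only for illustration. When both sides decline, Python falls back to an identity check, which is why two distinct objects compare as not equal:

```python
class MyClass:
    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        if isinstance(other, MyClass):
            return self.x == other.x
        return NotImplemented


class YourClass:
    def __init__(self, y):
        self.y = y

    def __eq__(self, other):
        if isinstance(other, YourClass):
            return self.y == other.y
        if isinstance(other, MyClass):
            return self.y == other.x
        return NotImplemented


mine, yours = MyClass(1), YourClass(1)

# MyClass.__eq__ declines, so Python tries the reflected YourClass.__eq__.
left_first = (mine == yours)

# YourClass.__eq__ knows about MyClass and answers directly.
right_first = (yours == mine)

# Neither MyClass nor str knows the other type: both return NotImplemented
# and the comparison falls back to identity.
unrelated = (mine == "1")

print(left_first, right_first, unrelated)
```

The comparison is symmetric here only because YourClass explicitly handles MyClass; MyClass itself needs no knowledge of YourClass.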
Upvotes: 0
Reputation: 16184
A few classes in Python (but I can't think of anything in the standard library off the top of my head) define an equality operator that handles multiple types on the RHS. One common library that does support this is NumPy, with:

    import numpy as np
    np.array(1) == 1

evaluating to True. In general I think I'd discourage this sort of thing, as there are lots of corner cases where this behaviour can get tricky. E.g. see the write-up on the __hash__ method in the Python 3 documentation (similar things exist in Python 2, but it's end-of-life). In cases where I have written similar code, I've tended to end up with something closer to:

    def __eq__(self, other):
        if isinstance(other, str):
            try:
                # Parse the string being compared, not the str type itself.
                other = self.parse(other)
            except ValueError:
                return NotImplemented
        if isinstance(other, DID):
            return self.scope == other.scope and self.name == other.name
        return NotImplemented

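One of the __hash__ corner cases alluded to above can be sketched as follows, assuming Python 3. A class that defines __eq__ without __hash__ becomes unhashable, and if a DID is to compare equal to a string it also has to hash like that string, otherwise dict and set lookups behave inconsistently. The stand-in class below is simplified for illustration (it compares against the string form directly instead of parsing):

```python
class EqOnly:
    """Defines __eq__ only, so Python 3 sets __hash__ to None."""

    def __eq__(self, other):
        return NotImplemented


try:
    hash(EqOnly())
    eq_only_hashable = True
except TypeError:
    eq_only_hashable = False


class DID:
    """Simplified stand-in that compares equal to its string form."""

    def __init__(self, scope, name):
        self.scope = scope
        self.name = name

    def __str__(self):
        return '{0.scope}:{0.name}'.format(self)

    def __eq__(self, other):
        if isinstance(other, str):
            return str(self) == other
        if isinstance(other, DID):
            return (self.scope, self.name) == (other.scope, other.name)
        return NotImplemented

    def __hash__(self):
        # Equal objects must hash equal, so hash the string form.
        return hash(str(self))


did = DID('part1', 'part1.part2')
lookup = {'part1:part1.part2': 'found'}
result = lookup[did]  # works only because hash and equality agree
print(eq_only_hashable, result)
```

Hashing a mutable object is itself fragile: changing scope or name after the object has been used as a key would break lookups, which is one argument for making such objects immutable.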
Further to this, I'd suggest making objects like this immutable, and you have a few ways of doing this. Python 3 has nice dataclasses, but given that you seem to be stuck on Python 2, you might use namedtuples, something like:

    from collections import namedtuple

    class DID(namedtuple('DID', ('scope', 'name'))):
        __slots__ = ()

        @classmethod
        def parse(cls, s, auto_scope=False):
            # Placeholder: the real parsing logic from the question goes here.
            return cls('foo', 'bar')

        def __eq__(self, other):
            if isinstance(other, str):
                try:
                    # Parse the string being compared, not the str type itself.
                    other = self.parse(other)
                except ValueError:
                    return NotImplemented
            return super(DID, self).__eq__(other)

which gives you immutability and a repr method for free, but you might want to keep your own str method. The __slots__ attribute means that accidentally assigning to obj.scopes will fail, but you might want to allow this behaviour.
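For anyone on Python 3.7 or later, the dataclass route mentioned above might look roughly like the sketch below. The parse body is a simplified reimplementation for illustration, not the code from the question, and the generated __eq__ deliberately compares DIDs only with other DIDs:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class DID:
    """Immutable data identifier; __eq__, __hash__ and __repr__ are generated."""

    scope: str
    name: str

    @classmethod
    def parse(cls, s, auto_scope=False):
        scope, sep, name = s.partition(':')
        if not sep:
            if not auto_scope:
                raise ValueError("Expecting 'scope:name' when auto_scope is False")
            # No scope given: infer it from the first dotted part of the name.
            return cls(s.split('.', 1)[0], s)
        if ':' in name:
            raise ValueError("Too many ':'")
        return cls(scope, name)

    def __str__(self):
        return f'{self.scope}:{self.name}'


did = DID.parse('part1:part1.part2')

try:
    did.scope = 'other'  # frozen dataclasses reject attribute assignment
    frozen = False
except FrozenInstanceError:
    frozen = True

print(repr(did), str(did), frozen)
```

With frozen=True and the default eq=True, the instances are hashable and can be used as dict keys or set members. Comparing against a string stays explicit: either DID.parse(s) == did or s == str(did).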
Upvotes: 3