I have a custom dataclass that is rather large (many attributes and methods). Some attributes are pandas DataFrames. The default __eq__ comparison does not work for the attributes that are pandas DataFrames. Hence, I started writing a custom __eq__ method to handle this. I came up with the following, which seems to work:
    def __eq__(self, other):
        # Compare only public, non-callable attributes.
        attribs = [a for a in dir(self)
                   if not a.startswith('_') and not callable(getattr(self, a))]
        for a in attribs:
            mine, theirs = getattr(self, a), getattr(other, a)
            if isinstance(mine, pd.DataFrame):
                # DataFrame == DataFrame is element-wise, so use .equals() instead.
                same = mine.equals(theirs)
            else:
                same = (mine == theirs)
            if not same:
                return False
        # Also reached when there are no attributes to compare.
        return True
I'm testing by creating a class instance, saving to pickle and then reading that pickle file to a new variable.
However, it left me with two questions:
1. I had to exclude callables from the comparison, because comparing them returns False even though the instances are otherwise the same, and I could not work out why the result is False. How does the default dataclass __eq__ actually handle this? Are callables also ignored?
2. I assume there is a faster, vectorized way to check the attributes without this for loop, but I don't see exactly how it would work. Any thoughts?
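For context on the first question: per the dataclasses documentation, the generated __eq__ compares the two instances as tuples of their fields, and only when both are of the identical class. Methods are not fields, so they never take part in that comparison. Below is a minimal sketch of a field-based __eq__ along those lines, assuming a hypothetical dataclass named Report with one plain field and one DataFrame field (both names are made up for illustration, not taken from my real class):

```python
from dataclasses import dataclass, fields

import pandas as pd


@dataclass(eq=False)  # eq=False: skip the generated __eq__, we define our own
class Report:  # hypothetical example class
    name: str
    table: pd.DataFrame

    def total(self):
        # A method is not a dataclass field, so fields() never returns it.
        return self.table["x"].sum()

    def __eq__(self, other):
        # Mirror the generated __eq__: only compare identical classes.
        if other.__class__ is not self.__class__:
            return NotImplemented
        for f in fields(self):
            mine, theirs = getattr(self, f.name), getattr(other, f.name)
            if isinstance(mine, pd.DataFrame):
                # DataFrame == DataFrame is element-wise; .equals() gives one bool.
                if not mine.equals(theirs):
                    return False
            elif mine != theirs:
                return False
        return True


a = Report("r", pd.DataFrame({"x": [1, 2]}))
b = Report("r", pd.DataFrame({"x": [1, 2]}))
c = Report("r", pd.DataFrame({"x": [1, 3]}))
print(a == b, a == c)
```

Iterating over fields() instead of dir() avoids having to filter out methods and private names by hand. The loop itself only runs once per field; the expensive part, comparing the DataFrames, already happens inside pandas.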