Reputation: 6260
I have a class which I want to be able to compare for equality. The class is large (it contains a bitmap image) and I will be comparing it multiple times, so for efficiency I'm hashing the data and only doing a full equality check if the hashes match. Furthermore, I will only be comparing a small subset of my objects, so I'm only calculating the hash the first time an equality check is done, then using the stored value for subsequent calls.
class Foo
{
public:
Foo(int data) : fooData(data), notHashed(true) {}
private:
void calculateHash()
{
hash = 0; // Replace with hashing algorithm
notHashed = false;
}
int getHash()
{
if (notHashed) calculateHash();
return hash;
}
inline friend bool operator==(Foo& lhs, Foo& rhs)
{
if (lhs.getHash() == rhs.getHash())
{
return (lhs.fooData == rhs.fooData);
}
else return false;
}
int fooData;
int hash;
bool notHashed;
};
According to the guidance on this answer, the canonical form of the equality operator is:
inline bool operator==(const X& lhs, const X& rhs);
Furthermore, the following general advice for operator overloading is given:
Always stick to the operator’s well-known semantics.
My function must be able to mutate it's operands in order to perform the hashing, so I have had to make them non-const
. Are there any potential negative consequences of this (examples might be standard library functions or STL containers which will expect operator==
to have const
operands)?
Should a mutating operator==
function be considered contrary to its well-known semantics, if the mutation doesn't have any observable effects (because there's no way for the user to see the contents of the hash)?
If the answer to either of the above is "yes", then what would be a more appropriate approach?
Upvotes: 21
Views: 1624
Reputation: 7136
You can go the mutable route, but I'm not sure if that is needed. You can do a local cache when needed without having to use mutable. For example:
#include <iostream>
#include <functional> //for hash
using namespace std;
template<typename ReturnType>
class HashCompare{
public:
ReturnType getHash()const{
static bool isHashed = false;
static ReturnType cachedHashValue = ReturnType();
if(!isHashed){
isHashed = true;
cachedHashValue = calculate();
}
return cachedHashValue;
}
protected:
//derived class should implement this but use this.getHash()
virtual ReturnType calculate()const = 0;
};
class ReadOnlyString: public HashCompare<size_t>{
private:
const std::string& s;
public:
ReadOnlyString(const char * s):s(s){};
ReadOnlyString(const std::string& s): s(s){}
bool equals(const ReadOnlyString& str)const{
return getHash() == str.getHash();
}
protected:
size_t calculate()const{
std::cout << "in hash calculate " << endl;
std::hash<std::string> str_hash;
return str_hash(this->s);
}
};
bool operator==(const ReadOnlyString& lhs, const ReadOnlyString& rhs){ return lhs.equals(rhs); }
int main(){
ReadOnlyString str = "test";
ReadOnlyString str2 = "TEST";
cout << (str == str2) << endl;
cout << (str == str2) << endl;
}
Output:
in hash calculate
1
1
Can you give me a good reason to keep as to why keeping isHashed as a member variable is necessary instead of making it local to where its needed? Note that we can further get away from 'static' usage if we really want, all we have todo is make a dedicated structure/class
Upvotes: 0
Reputation: 41958
Yes, introducing semantically unexpected side-effects is always a bad idea. Apart from the other reasons mentioned: always assume that any code you write will forever only be used by other people who haven't even heard of your name, and then consider your design choices from that angle.
When someone using your code library finds out his application is slow, and tries to optimize it, he will waste ages trying to find the performance leak if it is inside an == overload, since he doesn't expect it, from a semantic point of view, to do more than a simple object comparison. Hiding potentially costly operations within semantically cheap operations is a bad form of code obfuscation.
Upvotes: 1
Reputation: 7232
Upvotes: 1
Reputation: 14496
You should never modify the object on comparison. However, this function does not logically modify the object. Simple solution: make hash
mutable, as computing the hash is a form of cashing. See:
Does the 'mutable' keyword have any purpose other than allowing the variable to be modified by a const function?
Upvotes: 3
Reputation: 56863
It seems like a perfectly valid usecase for a mutable
member. You can (and should) still make your operator==
take the parameters by const reference and give the class a mutable
member for the hash value.
Your class would then have a getter for the hash value that is itself marked as a const
method and that lazy-evaluates the hash value when called for the first time. It's actually a good example of why mutable
was added to the language as it does not change the object from a user's perspective, it's only an implementation detail for caching the value of a costly operation internally.
Upvotes: 37
Reputation: 145249
Use mutable
for the data that you want to cache but which does not affect the public interface.
U now, “mutate” → mutable
.
Then think in terms of logical const
-ness, what guarantees the object offers to the using code.
Upvotes: 12