Reputation: 14839
I am not understanding something right. I am under the impression that an unordered_set will not allow duplicate elements, based on their hash.
I have a struct, with a specialisation of std::hash, which appears to allow duplicates, although I have manually checked it
AddEdge ( const std::shared_ptr<Relation> relation, const std::shared_ptr<Concept> concept )
{
auto edge = std::make_shared<Edge>( (Edge){ relation, concept } );
auto res = _edges.insert ( edge );
return res.second;
}
An overloaded function does exactly the same but for reversed parameters
This is how struct Edge is hashed:
namespace std
{
template<> struct hash<shared_ptr<Edge>>
{
size_t operator()( const shared_ptr<Edge> & ptr ) const
{
size_t from = 0;
size_t to = 0;
if ( auto node = std::dynamic_pointer_cast<Concept>( ptr->from ) )
from = hash<shared_ptr<Concept>>()( node );
else if ( auto node = std::dynamic_pointer_cast<Relation>( ptr->from ) )
from = hash<shared_ptr<Relation>>()( node );
if ( auto node = std::dynamic_pointer_cast<Concept>( ptr->to ) )
to = hash<shared_ptr<Concept>>()( node );
else if ( auto node = std::dynamic_pointer_cast<Relation>( ptr->to ) )
to = hash<shared_ptr<Relation>>()( node );
return hash<size_t>()( from + to );
}
};
}
And the container held in:
std::unordered_set<std::shared_ptr<Edge>> _edges;
When I do:
graph2->AddEdge( sea_node, is_node );
graph2->AddEdge( is_node, blue_node );
I get:
Edge [sea,is] hash = 10017731961838884781
Edge [is,blue] hash = 11178184051384786808
I try a second time exactly the same, and I get the same hashes, yet, when I check the edges, I now have 4 edges instead of 2.
What am I doing wrong?
EDIT: class Concept & Relation have the same kind of hash function:
namespace std
{
template<> struct hash<shared_ptr<Concept>>
{
size_t operator()( const shared_ptr<Concept> & ptr ) const
{
return hash<string>()( ptr->asToken()->value() ) + hash<int>()( ptr->TokenIndex() ) + hash<string>()( "Concept" );
}
};
}
Even more interestignly, my output from when I add Edges, produces the same hash, and yet it the duplicate Edge is added.
Upvotes: 3
Views: 16106
Reputation: 385144
an unordered_set will not allow duplicate elements, based on their hash
No, unordered_set
avoids duplicates by comparing values, not the hashes of those values†.
The "values" of each of your shared pointers is going to differ because they refer to different objects.
You can actually change this behaviour by providing your own function as the KeyEqual
template parameter to unordered_set
:
template<
class Key,
class Hash = std::hash<Key>, // <-- you've been looking at this
class KeyEqual = std::equal_to<Key>, // <-- instead of this
class Allocator = std::allocator<Key>
> class unordered_set;
† If only one value with a given hash were allowed in an unordered_set
then (a) you'd be unable to add any values that genuinely resulted in hash collisions, and (b) the entire hash collision resolution mechanism would become entirely unnecessary.
Upvotes: 14