Reputation: 2158
I am unfamiliar with tools in C++
. Please pardon me if this question sounds dump.
From the doc of map.find()
, the complexity is O(log(N))
. This seems to suggest the tree structure in the implementation.
From the doc of unordered_map.find()
, the average-complexity is constant while worst-case is O(N)
. This seems like a hash-table.
I am looking for a kind of map that allows me to:
unordered_map
satisfies (1) with unordered_map.rehash
, but unfound queries may take a long time. map
seems to have better performance for unfound queries, but without the pre-allocate-mem feature. Any suggestion?
Upvotes: 2
Views: 83
Reputation: 308111
A Bloom Filter is good for situations where you expect a lot of misses. It's kind of like a hash table, except that it uses multiple hashes and doesn't actually store the items in the table, just a bit to tell you if the item is not part of the collection. It will tell you very quickly if there's a miss, then you need a secondary method (such as the one suggested by Jerry Coffin) to do a second lookup if filter indicates a possible match.
Upvotes: 1
Reputation: 490048
Having a single, fixed number of items tends to imply that you're going to insert some specific items, then leave them in the collection until you're done with it.
If that's the case, I'd probably put the items into an std::vector
and sort them. Then, if the distribution is reasonably predictable, use an interpolating search, otherwise a binary search.
As long as you don't have to insert/delete more items and retain the order, this is typically quite a bit faster than a tree, even if you use a binary search, simply because the data is contiguous.
Given that you expect quite a few misses in your searches, I'd consider a hash table (unordered_map) that you set to an extremely low load factor, so that the vast majority of the time, you'll hash the key, and if it's not present chances are extremely good that you'll land on an empty hash bucket, so you'll get an indication that the search has failed very quickly. You can set the load factor with max_load_factor()
.
Upvotes: 3