Reputation: 7301
I have ID values of type unsigned int. I need to map an ID to a pointer in constant time.
Key Distribution:
IDs can take any value in the range 0 to UINT_MAX. Most of the keys will be clustered into a single group, but there will be outliers.
Implementation:
I thought about using the C++ ext/hash_map, but I've heard its performance isn't great when keys have a huge potential range.
I've also thought of using some form of chained lookup (equivalent to recursively subdividing the range into C chunks; a sketch follows the level breakdown below). If there are no keys in a range, that range points to NULL.
N = Key Range
Level 0 (divided into C = 16, so 16 pieces) = [0, N/16), [N/16, 2*(N/16)), ...
Level 1 (divided into C = 16, so 16 * 16 pieces) = ...
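A minimal sketch of this chained-lookup idea (effectively a 16-way radix trie), assuming C = 16 so that 4 bits are consumed per level and 8 levels cover a 32-bit key; the names are illustrative and void* stands in for the real pointer type:

    #include <array>
    #include <cstdint>
    #include <memory>

    struct RadixNode {
        std::array<std::unique_ptr<RadixNode>, 16> child{};  // null = empty subrange
        void* value = nullptr;                               // set at the leaf level
    };

    struct RadixMap {
        RadixNode root;

        void insert(std::uint32_t id, void* ptr) {
            RadixNode* node = &root;
            for (int shift = 28; shift >= 0; shift -= 4) {   // 8 levels of 4 bits
                unsigned idx = (id >> shift) & 0xF;
                if (!node->child[idx])
                    node->child[idx] = std::make_unique<RadixNode>();
                node = node->child[idx].get();
            }
            node->value = ptr;
        }

        void* find(std::uint32_t id) const {
            const RadixNode* node = &root;
            for (int shift = 28; shift >= 0; shift -= 4) {
                const RadixNode* next = node->child[(id >> shift) & 0xF].get();
                if (!next) return nullptr;                   // empty subrange
                node = next;
            }
            return node->value;
        }
    };

A lookup touches a fixed 8 nodes no matter how many keys are stored, at the cost of up to 8 pointer chases per query.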
Does anyone have other ideas on how this mapping could be implemented more efficiently?
Update:
By constant, I just meant that each key lookup is not significantly influenced by the number of values in the container. I did not mean it had to be a single operation.
Upvotes: 1
Views: 336
Reputation: 224049
How many items will be in the map, and how often will it change?
If all the values fit into the processor's cache, then a std::vector<std::pair<unsigned int, T*>> with presorted values and binary search might be fastest, even though the lookup is O(log N) rather than O(1).
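A minimal sketch of that approach, assuming the contents change rarely enough to keep the vector sorted; Widget is just a placeholder for the pointee type:

    #include <algorithm>
    #include <utility>
    #include <vector>

    struct Widget;                                   // placeholder pointee type

    using Entry = std::pair<unsigned int, Widget*>;
    std::vector<Entry> table;                        // kept sorted by key

    Widget* lookup(unsigned int id) {
        // Binary search over the sorted vector: O(log N) comparisons.
        auto it = std::lower_bound(
            table.begin(), table.end(), id,
            [](const Entry& e, unsigned int key) { return e.first < key; });
        return (it != table.end() && it->first == id) ? it->second : nullptr;
    }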
Upvotes: 1
Reputation: 1675
If you want a tree-based solution and your IDs are in the range {0..n-1}, then you can use a very cool data structure called a van Emde Boas tree. It yields all operations in O(log log n) time and uses O(n) space.
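A rough sketch of such a tree, supporting only insert and membership and assuming the universe size is a power of two (2^32 for unsigned int keys); the names are illustrative, and attaching a pointer payload or successor queries would be a further step:

    #include <cstdint>
    #include <memory>
    #include <utility>
    #include <vector>

    struct VEB {
        int64_t u;                                  // universe size (power of two)
        int64_t lower = 1;                          // size of each child cluster
        int64_t mn = -1, mx = -1;                   // min/max keys; -1 means "empty"
        std::unique_ptr<VEB> summary;               // tracks which clusters are non-empty
        std::vector<std::unique_ptr<VEB>> cluster;  // lazily allocated children

        explicit VEB(int64_t universe) : u(universe) {
            if (u > 2) {
                int64_t k = 0;
                while ((int64_t(1) << k) < u) ++k;           // u == 2^k
                lower = int64_t(1) << (k / 2);               // lower square root
                int64_t upper = int64_t(1) << ((k + 1) / 2); // number of clusters
                summary = std::make_unique<VEB>(upper);
                cluster.resize(size_t(upper));
            }
        }

        int64_t high(int64_t x) const { return x / lower; }
        int64_t low(int64_t x) const { return x % lower; }

        bool contains(int64_t x) const {
            if (x == mn || x == mx) return true;
            if (u <= 2) return false;
            const VEB* c = cluster[size_t(high(x))].get();
            return c && c->contains(low(x));
        }

        void insert(int64_t x) {
            if (mn == -1) { mn = mx = x; return; }  // tree was empty
            if (x == mn || x == mx) return;         // already present
            if (x < mn) std::swap(x, mn);           // the minimum lives only at this level
            if (u > 2) {
                int64_t h = high(x);
                auto& c = cluster[size_t(h)];
                if (!c) c = std::make_unique<VEB>(lower);
                if (c->mn == -1) summary->insert(h);  // cluster just became non-empty
                c->insert(low(x));                    // O(1) when the cluster was empty
            }
            if (x > mx) mx = x;
        }
    };

Usage would be along the lines of VEB tree(int64_t(1) << 32); tree.insert(42); tree.contains(42);.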
Upvotes: 3
Reputation: 81
As GMan suggests, an unordered_map is probably a good solution. If you are concerned about a large number of collisions in this hash map, then use a hash function that removes the clustering of your data. For example, you could swap the bytes around.
It's also worth noting that you will probably spend more time debugging and proving a custom data structure than one that already has a good pedigree.
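A hypothetical sketch of such a hash, reversing the byte order before hashing so that IDs differing only in their low bytes spread across buckets; Widget is a placeholder pointee type:

    #include <cstddef>
    #include <cstdint>
    #include <unordered_map>

    struct Widget;                                    // placeholder pointee type

    struct ByteSwappedHash {
        std::size_t operator()(unsigned int id) const {
            std::uint32_t x = id;
            x = ((x & 0x000000FFu) << 24) |
                ((x & 0x0000FF00u) <<  8) |
                ((x & 0x00FF0000u) >>  8) |
                ((x & 0xFF000000u) >> 24);            // reverse the byte order
            return static_cast<std::size_t>(x);
        }
    };

    std::unordered_map<unsigned int, Widget*, ByteSwappedHash> id_map;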
Upvotes: 1
Reputation: 57648
Reserve 4GB of your RAM for this, and simply cast your uint to the pointer. That's definitely constant time.
Upvotes: 1
Reputation: 992887
If your integer values are 32 bits wide, then you could use a 64-bit platform, allocate 32 gigabytes of memory (8 bytes for each of about 4 billion pointers), and use a flat array. That is as close as you're going to get to constant lookup time.
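A sketch of the flat-array idea; it genuinely needs about 32 GiB and a 64-bit platform, so this only shows the shape of it, with Widget as a placeholder pointee type:

    #include <cstddef>
    #include <vector>

    struct Widget;                                    // placeholder pointee type

    // One pointer slot per possible 32-bit ID: (1 << 32) * 8 bytes = 32 GiB.
    std::vector<Widget*> table(std::size_t(1) << 32, nullptr);

    Widget* lookup(unsigned int id)            { return table[id]; }  // one index, O(1)
    void    store (unsigned int id, Widget* p) { table[id] = p; }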
Upvotes: 1
Reputation: 503805
Use a hash map (unordered_map). This gives ~O(1) look-up times. You "heard" it was bad, but did you try it, test it, and determine it to be a problem? If not, use a hash map.
After your code gets close to completion, profile it and determine if the look-up times are the main cause of slowness in your program. Chances are, it won't be.
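A minimal sketch of that, with Widget standing in for whatever the pointers point at:

    #include <unordered_map>

    struct Widget;                                    // placeholder pointee type

    std::unordered_map<unsigned int, Widget*> id_map;

    void store(unsigned int id, Widget* p) { id_map[id] = p; }

    Widget* lookup(unsigned int id) {
        auto it = id_map.find(id);                    // expected O(1)
        return it == id_map.end() ? nullptr : it->second;
    }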
Upvotes: 11