Darius
Darius

Reputation: 5259

C++ (Hashmap style) Data Structure Ideal For This Scenario?

People have asked similar questions about the efficiency of various data structures but none I have read are totally applicable to my scenario so I wondered if people had suggestions for one that was tailored to satisfy the following criteria efficiently:

I am developing for a company that will use the program commercially so any third-party data structures should come with no copyright protection or anything, but if the STL has a data structure that will do the job efficiently then that would be perfect.

I know there are countless Hashmap/Dictionary style C++ data structures with implementations that are built to satisfy different criteria so if someone can suggest one ideal for this situation then that would be greatly appreciated.

Many thanks

Edit:

I found this passage on SO that seems to suggest unordered_map would be good?

hash_map and unordered_map are generally implemented with hash tables. Thus the order is not maintained. unordered_map insert/delete/query will be O(1) (constant time) where map will be O(log n) where n is the number of items in the data structure. So unordered_map is faster, and if you don't care about the order of the items should be preferred over map. Sometimes you want to maintain order (ordered by the key) and for that map would be the choice.

Upvotes: 2

Views: 737

Answers (5)

b.buchhold
b.buchhold

Reputation: 3896

As for build-in solutions I'd recommand google::dense_hash_map. They are really fast especially for numeric keys. You'll have to decide on a specific key that will be reserved as "empty_key". Moreover here is a really nice comparison of different hash-map implementations.

An excerpt

Library         Linux-intCPU (sec)  Linux-strCPU (sec)   Linux PeakMem (MB)
glib            3.490               4.720                24.968
ghthash         3.260               3.460                61.232
CC’s hashtable  3.040               4.050                129.020
TR1             1.750               3.300                28.648
STL hash_set    2.070               3.430                25.764
google-sparse   2.560               6.930                5.42/8.54
google-dense    0.550               2.820                24.7/49.3
khash (C++)     1.100               2.900                6.88/13.1
khash (C)       1.140               2.940                6.91/13.1
STL set (RB)    7.840               18.620               29.388
kbtree (C)      4.260               17.620               4.86/9.59
NP’s splaytree  11.180              27.610               19.024

However, when setting a "deleted_key", this map can also perform deletions. So maybe it'll be possible to create a custom solution that is even more efficient. But apart from that minor point, any hash-map should exactly suit your needs (note that "map" is an ordered tree-map and thus slower).

Upvotes: 2

Puppy
Puppy

Reputation: 146910

What you're looking for is an unordered_set. You can find one in Boost, TR1, or C++0x. If you're hoping to associate the key with a value, then unordered_map does just that- also in Boost/TR1/C++0x.

Upvotes: 0

LeleDumbo
LeleDumbo

Reputation: 9340

Looks like a prefix tree (with element at each node end) also fits in this scenario. It's damn fast, even faster than hash map because no hash value calculation is done and getting a value is purely O(n) where n is the key length. It's a bit memory hungry but common prefix of keys are shared in the same node path.

EDIT: I assume the keys are string, not simple values like integers

Upvotes: 2

James Kanze
James Kanze

Reputation: 153909

It sounds like std::unordered_set would fit the bill, but without knowing more about the key, it's difficult to say. I'm curious about how you can guarantee that there will be no possibility of collisions: this implies a small (less than the size of the table), finite set of keys. If this is the case, it may be more efficient to map the keys to a small int, and use std::vector (with empty slots for the entries not present).

Upvotes: 0

TC1
TC1

Reputation: 1

What you need definitely sounds like a hash set, C++ has this as either std::tr1::unordered_set or in Boost.Unordered.

P.S. Note, however, that TR1 is not yet standard, and you'll probably need to get Boost for the implementation.

Upvotes: 1

Related Questions