impulsionaudio
impulsionaudio

Reputation: 229

Fast container for consistent performance

I am looking a container to support frequent adds/removals. I have not idea how large the container may grow but I don't want to get stalls due to huge reallocations. I need a good balance between performance and consistent behavior.

Initially, I considered std::tr1::unordered_map but since I don't know the upper bound of the data set, collisions could really slow down the unordered_map's performance. It's not a matter of a good hashing function because no matter how good it is, if the occupancy of the map is more than half the bucket count, collisions will likely be a problem.

Now I'm considering std::map because it doesn't suffer from the issue of collisions but it only has log(n) performance.

Is there a way to intelligently handle collisions when you don't know the target size of an unordered_map? Any other ideas for handling this situation, which I imagine is not uncommon?

Thanks

Upvotes: 1

Views: 417

Answers (2)

zvrba
zvrba

Reputation: 24546

What kind of access do you need? Sequential, random access, lookup by key? Plus, you can rehash unordered map either manually (rehash method), and set its load factor. In any case the hash will rebuild itself when chains get too long (i.e., when the load factor is exceeded). Additionally, the slow-down point of a hash table is when it is full ~80%, not 50%.

You should really have read the documentation, for example here.

Upvotes: 0

Mike Dunlavey
Mike Dunlavey

Reputation: 40649

This is a run-time container, right?

Are you adding at the end (as in push_back) or in the front or the middle? Are you removing at random locations, or what?

How are you referencing information in it? Randomly, or from the front or back, or what?

If you need random access, something based on array or hash is preferred.

If reallocation is a big problem, you want something more like a tree or list.

Even so, if you are constantly new-ing (and delete-ing) the objects that you're putting in the container, that alone is likely to consume a large fraction of time, in which case you might find it makes sense to save used objects in junk lists, so you can recycle them.

My suggestion is, rather than agonize over the choice of container, just pick one, write the program, and then tune it, as in this example. No matter what you choose, you will probably want to change it, maybe more than once. What I found in that example was that any pre-existing container class is justifying its existence by ease of programming, not by fastest-possible speed.

I know it's counter-intuitive, but unless some other activity in your program ends up being the dominant cost, and you can't shrink it, your final burst of speed will require hand-coding the data structure.

Upvotes: 1

Related Questions