Reputation: 21032

Are dicts always the most efficient way to store hashes in python?

In many languages there are numerous different implementations of the hash map/associative array, each suited to different tasks. In Python as far as I know there's just dict.

So I guess what I'm wondering is, are there any applications for which a customized data structure makes sense? Or will it always be slower than using one of Python's built in data structures?

Upvotes: 3

Answers (1)

Sylvain Leroux

Reputation: 52070

This should probably be in comment, but far too long so...

Python dictionary are implemented as as hash table. Hash tables are good to retrieve items on average ( O(1) ). If they are generally good, their performances decrease when there are a lot of collisions. A collision is when the computed hash for two different keys is the same. So at worst, that is if all the keys have the same hash, their complexity for the search operation is in O(n) -- that is the equivalent of a linear search.

But associative arrays could be implemented by using several other data structures. Notably binary search trees (not that good: O(log n) on average, O(n) at worst) or B-tree (better: O(log n) on average and at worst). AFAIK the standard Python library does not provide any of those. By googling a little, it seems to have various implementation hanging around.

That leads to the core of your question: "are there any applications for which a customized data structure makes sense? Or will it always be slower than using one of Python's built in data structures?". As of myself, I think any pure-Python implementation of any of these data structures will probably be slowly that native Python dictionary. If you have very special needs, you might consider implementing those at C-level. But if you have such constraints, maybe Python wasn't the best choice at first.

Upvotes: 1

Are dicts always the most efficient way to store hashes in python?

Answers (1)

Related Questions