user365636
user365636

Reputation:

Using a hash table to create an unlimited array

I am currently developing a programming language in C, and I want to allow users to create apparently "unlimited" arrays with numerical indices without sacrificing performance in the process. For example, table [1000000000] would ideally be creatable and accessible in an instant without the memory overhead of a table of 1,000,000,000 items, 999,999,999 of which were unused; but the array would also perform well when table [n] was defined for, say, 1 ≤ n ≤ 1000000.

Do you have suggestions for the implementation of such an array-handling system?

Upvotes: 0

Views: 912

Answers (6)

Davi
Davi

Reputation: 497

Using cmph won't help. You need to know all keys in advance to create a (minimal) perfect hash function.

What you want is a simple associative mapping structure that will let you implement a sparse array. Any hash table or tree structure will do. You can use hash_map or map out of the box from your c++ stl implementation or any similar data structure.

If you want to go fancy, you can use Judy Arrays, but I will doubt it will make any difference unless you can properly benchmark stuff and are willing to consider more complex data structures that will do assumptions on your particular use case.

Do the simple thing. The easiest available hash table available for you is the best answer. Don't even bother thinking about hash functions or such, whatever your platform provides will work well enough.

Upvotes: 0

Spudd86
Spudd86

Reputation: 3006

There's Judy Arrays http://judy.sourceforge.net/

Upvotes: 1

the_void
the_void

Reputation: 5538

I think you've answered it yourself. Take a look at CMPH - C Minimal Perfect Hashing Library.

EDIT:

Or you could use a B+ Tree to map the integer to the real index in the array. Using B Trees has another advantage: you can make range queries.

Upvotes: 0

Favonius
Favonius

Reputation: 13984

Theoretically I think it is possible. What you need is very good hashing algorithm (to avoid collisions). So if anyone says table[100..0]; you need not to allocate the space at once. Allocate space on need basis. So if in the table[100..0] I am trying to populate first 5 values then I will be storing those five values only and if i try to access lets say table[100] then I should be getting something like 'undef' or 'nil' ....

the library mentioned by the_void seems good... though i haven't tested ... :)

Upvotes: 0

Hasturkun
Hasturkun

Reputation: 36412

You're creating a Sparse Array, as the Wikipedia article mentions, these can be represented by a linked list.

Each node in the linked list may be a dynamically allocated array so that you don't suffer excessive overhead for consecutive indices.

Upvotes: 0

vodkhang
vodkhang

Reputation: 18741

How about using pointer, you don't have to define number of elements for it, you can add as many elements as you want

Upvotes: 0

Related Questions