Abhishek

Reputation: 2673

Frequent random access of large arrays

I have to implement 4 hash tables, each very large (~1 GB). The hash values will serve as the addresses for accessing the data. Since the hash values are randomly distributed, I have to insert into and read from the tables at random positions. What is the best way to implement such a design? I can't keep the arrays in RAM, since that would exhaust memory. Would I benefit from using mmap? Or is there another design scheme?

Any help is highly appreciated.

Upvotes: 0

Views: 77

Answers (2)

nitish712

Reputation: 19774

Since you say the hash tables can't fit into RAM, I would store them on the hard disk as files, where each value is kept on its own line or in a delimiter-separated format.

Whenever you want to update a value in the file, instead of reading the whole file and rewriting it to change one value (which is slow), store some information about the exact location of each value at the beginning of the file, then just seek to that position and update it in place. As for mmap, you can still prefer it, because it demand-pages the file into memory. But I suspect it won't be faster than a single seek-and-write, since every page fault brings in another part of the file.
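As a minimal sketch of the mmap approach (in Python, with a made-up file name and a table scaled down from the ~1 GB in the question), fixed-size slots let a hash value map directly to a file offset, and only the touched pages are ever faulted into memory:

```python
import mmap
import struct

PATH = "hash_table.bin"   # hypothetical file name for the demo
SLOT_SIZE = 8             # one 64-bit value per slot
NUM_SLOTS = 1_000_000     # scaled-down stand-in for the ~1 GB table

# Create a sparse file covering the whole table; unwritten regions
# cost no disk space on most filesystems.
with open(PATH, "wb") as f:
    f.truncate(NUM_SLOTS * SLOT_SIZE)

f = open(PATH, "r+b")
mm = mmap.mmap(f.fileno(), NUM_SLOTS * SLOT_SIZE)

def put(slot, value):
    """Write one fixed-size record; only the touched page is faulted in."""
    mm[slot * SLOT_SIZE:(slot + 1) * SLOT_SIZE] = struct.pack("<Q", value)

def get(slot):
    """Read one fixed-size record back from its slot."""
    return struct.unpack("<Q", mm[slot * SLOT_SIZE:(slot + 1) * SLOT_SIZE])[0]

put(123456, 42)            # a random-looking slot, as a hash would produce
assert get(123456) == 42
mm.flush()                 # push dirty pages back to the file
```

The same slot arithmetic works for the plain seek-and-write variant: replace the mmap slicing with `f.seek(slot * SLOT_SIZE)` followed by `f.write(...)` or `f.read(SLOT_SIZE)`.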

Upvotes: 0

Fred Foo

Reputation: 363797

Berkeley DB supports on-disk hash tables.
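Berkeley DB's native API is C (with bindings for many languages); as a quick stand-in that illustrates the same on-disk hash-table idea, Python's standard-library `dbm` module gives a persistent, dict-like table backed by a file (the file name here is made up):

```python
import dbm

# "c" creates the file if it does not exist.
db = dbm.open("table.db", "c")

# Keys and values are byte strings; writes go to disk, so the table
# can grow far beyond RAM.
db[b"some-hash-key"] = b"payload bytes"
assert db[b"some-hash-key"] == b"payload bytes"
db.close()

# Reopen read-only to show the data persisted.
db = dbm.open("table.db", "r")
assert db[b"some-hash-key"] == b"payload bytes"
db.close()
```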

Upvotes: 1
