Reputation: 2673
I have to implement 4 Hash Tables. The size of hash tables are going to be very large ~ 1 GB Each. The hash values will be the address of accessing the data. Now, since the hash values will be randomly distributed, I have to insert and read data from the hash tables randomly. What is the best way to implement such a design? I cant use the arrays in RAM since it will cause overflow. Will I be benefited by using the mmap? Or any other design scheme.
Any help is highly appreciated.
Upvotes: 0
Views: 77
Reputation: 19774
Since you are saying that the hash tables can't fit into RAM
, I think of storing these hash tables
on the hard disk
in the form of files, where each value can be stored in a different line or in a delimiter
seperated fashion.
Whenever you want to update a value in the file, I would suggest you that instead of reading the whole file and updating a value which is rather slow, you can store some information about the exact location of each value at the beginning of the file, and just seek
to that position and update it. Regarding mmap
, you can still prefer it, because it performs demand paging
in bringing a file into the memory. But I guess this can't be faster than one-time seek and write
instead of every time bringing part of the file when a page-fault
occurs.
Upvotes: 0