Reputation: 553
I am trying to write a search engine for a large collection, for learning purposes. I started from my own intuitions, then did some research, and I am finally arriving at a working model.
I am constructing a giant hash table to hold all the terms in my collection. Building it from the collection is very expensive, so once I have computed the table I want to save it to disk and load it back whenever my program needs it later.
Is there a standard way of doing this, or do I have to invent my own file format and hacks?
Note: The hash table is only for storing all term occurrences. I plan to store the main ranking data in a postings file and keep a pointer to it in the corresponding term's entry in the hash table.
I am working in C.
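To make the idea concrete, here is a minimal sketch of one possible hand-rolled format, not a standard one. It assumes an open-addressing table with fixed-size slots, so the whole slot array can be written and read with a single call. All names (`Dict`, `Slot`, `dict_save`, `dict_load`, `TERM_MAX`, `DICT_MAGIC`) are hypothetical, terms are assumed to fit in 31 bytes, and the file is assumed to be read back by the same compiler on the same machine, since struct layout and byte order are written as-is.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TERM_MAX   32           /* assumption: terms fit in 31 bytes + NUL */
#define DICT_MAGIC 0x54444943u  /* arbitrary tag to recognise the file */

typedef struct {
    char     term[TERM_MAX];    /* empty string marks a free slot */
    uint64_t postings_offset;   /* byte offset into the postings file */
} Slot;

typedef struct {
    uint32_t nslots;            /* capacity of the slot array */
    uint32_t count;             /* number of occupied slots */
    Slot    *slots;
} Dict;

/* FNV-1a string hash. */
static uint32_t hash_term(const char *s) {
    uint32_t h = 2166136261u;
    while (*s) { h ^= (unsigned char)*s++; h *= 16777619u; }
    return h;
}

int dict_init(Dict *d, uint32_t nslots) {
    if (nslots == 0) return -1;
    d->slots = calloc(nslots, sizeof(Slot));  /* zeroed: every slot is free */
    d->nslots = nslots;
    d->count = 0;
    return d->slots ? 0 : -1;
}

/* Linear probing: returns the slot holding term, or the free slot where it
   would be stored, or NULL if the table is full and term is absent. */
static Slot *dict_probe(const Dict *d, const char *term) {
    assert(d->nslots > 0);
    uint64_t start = hash_term(term) % d->nslots;
    for (uint64_t n = 0; n < d->nslots; n++) {
        Slot *s = &d->slots[(start + n) % d->nslots];
        if (s->term[0] == '\0' || strcmp(s->term, term) == 0) return s;
    }
    return NULL;
}

int dict_put(Dict *d, const char *term, uint64_t offset) {
    if (term[0] == '\0' || strlen(term) >= TERM_MAX) return -1;
    Slot *s = dict_probe(d, term);
    if (!s) return -1;
    if (s->term[0] == '\0') { strcpy(s->term, term); d->count++; }
    s->postings_offset = offset;
    return 0;
}

int dict_get(const Dict *d, const char *term, uint64_t *offset) {
    Slot *s = dict_probe(d, term);
    if (!s || s->term[0] == '\0') return -1;
    *offset = s->postings_offset;
    return 0;
}

/* File layout: three uint32_t (magic, nslots, count), then the raw slots. */
int dict_save(const Dict *d, const char *path) {
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    uint32_t hdr[3] = { DICT_MAGIC, d->nslots, d->count };
    int ok = fwrite(hdr, sizeof hdr, 1, f) == 1 &&
             fwrite(d->slots, sizeof(Slot), d->nslots, f) == d->nslots;
    if (fclose(f) != 0) ok = 0;
    return ok ? 0 : -1;
}

int dict_load(Dict *d, const char *path) {
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    uint32_t hdr[3];
    if (fread(hdr, sizeof hdr, 1, f) != 1 || hdr[0] != DICT_MAGIC ||
        dict_init(d, hdr[1]) != 0) {
        fclose(f);
        return -1;
    }
    d->count = hdr[2];
    size_t got = fread(d->slots, sizeof(Slot), d->nslots, f);
    fclose(f);
    if (got != d->nslots) { free(d->slots); d->slots = NULL; return -1; }
    return 0;
}
```

The table would be built once with `dict_put`, written with `dict_save`, and restored in later runs with `dict_load`, after which `dict_get` returns the postings-file offset for a term. Because slots contain no pointers, the array can be dumped verbatim; a table that chains entries through heap pointers could not be saved this way without first converting those pointers to offsets.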
Upvotes: 2
Views: 679