Tim Bill

Reputation: 23

How to compare an already existing file with a new file in Haskell

I am implementing an inverted index in Haskell. My code already builds the inverted index for every word in the list of documents I provide and writes it to a file called inv.txt, which looks something like this:

("this",[d1,d2,d5])

("is",[d3,d4,d16])

("hello",[d1])

However, I want to handle an additional case: if I add a new document to my folder and call the function, "inv.txt" should be updated to reflect the newly added document, so that it becomes something like:

("this",[d1,d2,d5,new Doc])

("is",[d3,d4,d16])

("hello",[d1])

("get",[new Doc])

But I cannot think of an approach. Is this possible in Haskell without rewriting the whole file, for example by using seekg or peekg functions or something similar?

Upvotes: 0

Views: 46

Answers (1)

ErikR

Reputation: 52039

Since your primary use case is adding a document to the index, how about this:

  1. Your index file consists of text lines, one line per key.
  2. When reading the index, you insert the key-value pair into your Map (or whatever you are using as your inverted index) if it is the first time you have seen the key. Otherwise, if the key already exists, you append/prepend the value to the existing value in the Map.
  3. To add a new document, simply seek to the end of the file and write out the pairs for just the new document.
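
Step 3 could be sketched roughly as follows. This is only an illustration, not the asker's code: `entriesFor` and `addDocument` are hypothetical names, and it assumes document IDs are stored as quoted strings (e.g. `("get",["d17"])`) so that `show` produces the line format directly. `appendFile` from the Prelude writes at the end of the file and leaves the existing lines untouched.

```haskell
import Data.List (nub)

-- Hypothetical helper: one index line per distinct word of the new document.
-- Assumes document IDs are plain strings, so `show` yields e.g. ("get",["d17"]).
entriesFor :: String -> String -> [String]
entriesFor docId body = [ show (w, [docId]) | w <- nub (words body) ]

-- Append the new document's entries to the index file.
-- Nothing already in the file is read or rewritten.
addDocument :: FilePath -> String -> String -> IO ()
addDocument indexPath docId body =
  appendFile indexPath (unlines (entriesFor docId body))
```

For example, `addDocument "inv.txt" "d17" "get this"` would append one line for "get" and one for "this", each listing only d17.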

For example, suppose your index file contains the lines:

("this",[d1,d2,d5])
...
("this", [d6])

When you encounter the first "this" pair, you create the entry in your Map; when you encounter the second "this" pair, you append/prepend d6 to the list currently associated with the key.
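
A minimal sketch of this merge-on-read step, under some assumptions: the names `Index`, `parseEntry`, `buildIndex` and `loadIndex` are made up for illustration, the index is a `Data.Map` from the containers package, and document IDs are stored as quoted strings so each line can be parsed with `readMaybe`. Lines that do not parse (blank lines, for instance) are skipped.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (foldl')
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- Assumed representation: word -> list of document IDs.
type Index = Map.Map String [String]

-- Parse one index line such as ("this",["d1","d2"]); Nothing if malformed.
parseEntry :: String -> Maybe (String, [String])
parseEntry = readMaybe

-- Fold the lines into a Map. insertWith calls the function as
-- (new value, old value), so a repeated key gets its documents appended.
buildIndex :: [String] -> Index
buildIndex = foldl' step Map.empty . mapMaybe parseEntry
  where
    step idx (w, docs) = Map.insertWith (\new old -> old ++ new) w docs idx

-- Read the index file and merge duplicate keys while loading.
loadIndex :: FilePath -> IO Index
loadIndex path = do
  contents <- readFile path
  -- Force the whole file to be read so the lazy handle is closed
  -- before anything later appends to the same file.
  length contents `seq` return (buildIndex (lines contents))
```

Since appended lines can repeat a key many times, the file grows with every added document; rewriting it occasionally from the merged Map would compact it again.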

Upvotes: 1
