Reputation: 404
Let's say I want to create an inverted index on a document with 4 unique words in it.
It will look like word1 -> document, word2 -> document, word3 -> document, word4 -> document.
Using a size-limited Ehcache cache along with a Terracotta cluster, I can put all four associations separately in the cache.
But here's what I'm wondering about: Would the cache maintain one copy of the document or would it store four of those? My guess is it'd be four serialised copies (which is undesirable for my case). If that's true, what's a better way to do this?
Upvotes: 1
Views: 63
Reputation: 14500
You are correct that any storage layer in Ehcache, with the exception of the in-memory one, will use a serialized version, and thus your document will effectively be duplicated.
As suggested in a comment, you could add a level of indirection between the words and the document. You could also store only an ID in the cache and have the document live elsewhere.
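A minimal sketch of that indirection, under stated assumptions: the two `HashMap`s below are stand-ins for two separate caches (or for one cache plus an external document store), and the class name `InvertedIndexSketch`, its methods, and the whitespace tokenization are all hypothetical, not part of Ehcache. The point is only that each word maps to a small document ID, so the document body is stored once no matter how many words point at it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class InvertedIndexSketch {
    // Stand-in for the word cache: word -> IDs of documents containing it.
    private final Map<String, Set<String>> wordToDocIds = new HashMap<>();
    // Stand-in for the document store: one entry per document.
    private final Map<String, String> docIdToDocument = new HashMap<>();

    // Store the document once, then map each of its words to the ID only.
    public void index(String docId, String document) {
        docIdToDocument.put(docId, document);
        for (String word : document.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                wordToDocIds.computeIfAbsent(word, w -> new HashSet<>()).add(docId);
            }
        }
    }

    // Two-step lookup: word -> IDs, then ID -> document.
    public List<String> lookup(String word) {
        List<String> documents = new ArrayList<>();
        Set<String> docIds = wordToDocIds.get(word.toLowerCase());
        if (docIds == null) {
            return documents;
        }
        for (String docId : docIds) {
            String document = docIdToDocument.get(docId);
            // In a size-limited cache the document may have been evicted.
            if (document != null) {
                documents.add(document);
            }
        }
        return documents;
    }

    // Number of document bodies actually stored.
    public int storedDocuments() {
        return docIdToDocument.size();
    }
}
```

With a real size-limited cache, the word entries and the document entry can be evicted independently, so a lookup has to tolerate an ID whose document is no longer present, as the null check above does.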
What is clear is that with direct mappings you should not rely on modifications made to the document through one mapping being visible through the other mappings. You would be abusing the cache.
Upvotes: 0