Reputation: 4509
I need to cache some entities, for example a page tree in a content management system (CMS). The system allows developers to write plugins, in which they can access the cached page tree. Is it good or bad to make the cached page tree mutable? That is, the tree node objects have setters and/or the ChildPages collection exposes Add and Remove methods, so client code can set properties of the page tree nodes and add or remove nodes freely.
Here are my thoughts:
(1) If the page tree is immutable, plugin developers have no way to modify the tree unexpectedly. That way we can avoid some subtle bugs.
(2) But sometimes we need to change the name of a page. If the page tree is immutable, we have to invoke some method like "Refresh()" to refresh the cache. This causes an extra database hit (two database hits in total: one to save the change and one to reload the tree, where the second could have been avoided). If the page tree is mutable, we can change the name directly in the cached tree to bring it up to date, so only one database hit is needed.
What do you think about it? And what would you do if you encountered such a situation?
Thanks in advance! :)
UPDATE: The page tree is something like:
public class PageCacheItem {
    public string Name { get; set; }
    public string PageTitle { get; set; }
    public PageCacheItemCollection Children { get; private set; }
}
My problem here is not about the hash code, because a PageCacheItem won't be used as a key in a HashSet or Dictionary.
My problem is:
Suppose the PageCacheItem (the tree node) is mutable, that is, its properties have setters (e.g., the Name and PageTitle properties). If a plugin author changes the properties of a PageCacheItem by mistake, the system ends up in an incorrect state (the cached data is no longer consistent with the data in the database). Such a bug is hard to track down, because it is caused by a plugin, not by the system itself.
But if the PageCacheItem is read-only, it might be hard to implement an efficient "cache refresh", because without setters we can't simply update the properties to their latest values.
UPDATE2
Thanks guys. One thing to note: I'm not going to develop a generic caching framework, but some APIs on top of an existing caching framework. So my APIs are a middle layer between the underlying caching framework and the plugin authors. A plugin author doesn't need to know anything about the underlying caching framework. They only need to know that this page tree is retrieved from the cache, and they get the strongly typed PageCacheItem API to use, not the weakly typed "object" returned by the underlying caching framework.
So my question is about designing APIs for plugin authors: is it good or bad to make the API class PageCacheItem mutable (here mutable == properties can be set from outside the PageCacheItem class)?
Upvotes: 0
Views: 1174
Reputation: 113282
First, I assume you mean the cached values may or may not be mutable, rather than the identifier it is identified by. If you mean the identifier too, then I would be quite emphatic about being immutable in this regard (emphatic enough to have my post flagged for obscene language).
As for mutable values, there is no one right answer here. You've hit on the primary pro and con either way, and there are multiple variants within each of the two options you describe. Cache invalidation is in general a notoriously difficult problem (as in the well known quote from Phil Karlton, "There are only two hard problems in Computer Science: cache invalidation and naming things."*)
Some things to consider:
I haven't mentioned threading issues, because they are difficult with any sort of cache unless you're single-threaded (and if it's a CMS I'm guessing it's web-based, and hence inherently multi-threaded). One thing I will say on the matter is that a cache failure generally isn't critical (by definition, a cache failure has a fallback: get the fresh value). For this reason it can be fruitful to take an approach where, rather than blocking indefinitely on the monitor (which is what lock does internally), you use Monitor.TryEnter with a timeout, and have the cache operation fail if the timeout is hit. Using a ReaderWriterLockSlim and allowing a slightly longer timeout for writing can be a good approach. This way, if you hit a point of heavy lock contention, the cache will stop working for some threads, but those threads still get usable data. This will suck for performance on those threads, but not as much as lock contention would for all affected threads, and caches are a place where it is very easy to introduce lock contention into a web project that only shows up once you've gone live.
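The timeout idea above could look roughly like the following. This is only a sketch: the class name TimeoutCache, its method names, and the two timeout values are made up for illustration and are not part of any existing caching framework. It uses ReaderWriterLockSlim with a shorter timeout for reads and a slightly longer one for writes, and treats a timed-out lock as a plain cache miss.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical wrapper: a cache that gives up instead of blocking indefinitely on a lock.
public sealed class TimeoutCache<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> _items = new Dictionary<TKey, TValue>();
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

    // Assumed values for illustration; tune them for the real workload.
    private static readonly TimeSpan ReadTimeout = TimeSpan.FromMilliseconds(50);
    private static readonly TimeSpan WriteTimeout = TimeSpan.FromMilliseconds(200);

    // Returns false on a miss OR when the read lock could not be taken in time.
    public bool TryGet(TKey key, out TValue value)
    {
        value = default(TValue);
        if (!_lock.TryEnterReadLock(ReadTimeout))
            return false; // contention: behave like a cache miss
        try
        {
            return _items.TryGetValue(key, out value);
        }
        finally
        {
            _lock.ExitReadLock();
        }
    }

    // Returns false when the write lock could not be taken in time;
    // the value is then simply not cached.
    public bool TrySet(TKey key, TValue value)
    {
        if (!_lock.TryEnterWriteLock(WriteTimeout))
            return false;
        try
        {
            _items[key] = value;
            return true;
        }
        finally
        {
            _lock.ExitWriteLock();
        }
    }

    // Falls back to loading the fresh value whenever the cache cannot answer.
    public TValue GetOrLoad(TKey key, Func<TKey, TValue> loadFresh)
    {
        TValue value;
        if (TryGet(key, out value))
            return value;
        value = loadFresh(key);
        TrySet(key, value); // best effort; a failure here is not an error
        return value;
    }
}
```

A caller would use GetOrLoad with a delegate that reads from the database. Under heavy contention some calls skip the cache and hit the database, but every caller still gets usable data.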
*(and of course the well known variant, "there are only two hard problems in Computer Science: cache invalidation, naming things, and off-by-one errors").
Upvotes: 4
Reputation: 117250
Look at it this way, if the entry is mutable, then it is likely that the hashcode will change when the object is mutated.
Depending on the dictionary implementation of the cache, it could either:
There may be valid reasons why you want 'mutable hashcodes' but I cannot see a justification here. (I have only ever needed to do this once in the last 9 years).
It would be a lot easier just to remove and replace the entry you wish to be 'mutated'.
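As a rough sketch of what "remove and replace" could mean for the page tree in the question: the node type below has no public setters, and a rename builds a new tree that is then swapped in as the cached root, without reloading from the database. The names ImmutablePageNode, Rename and PageTreeCache are hypothetical, and the static field only stands in for whatever caching framework actually holds the entry.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical read-only counterpart of PageCacheItem: plugins cannot change it.
public sealed class ImmutablePageNode
{
    public string Name { get; private set; }
    public string PageTitle { get; private set; }
    public IReadOnlyList<ImmutablePageNode> Children { get; private set; }

    public ImmutablePageNode(string name, string pageTitle,
                             IEnumerable<ImmutablePageNode> children = null)
    {
        Name = name;
        PageTitle = pageTitle;
        // Copy the children so later changes to the caller's collection cannot leak in.
        Children = (children ?? Enumerable.Empty<ImmutablePageNode>()).ToList().AsReadOnly();
    }

    // Returns a copy of this subtree in which every page named oldName is renamed.
    // The original tree is left untouched.
    public ImmutablePageNode Rename(string oldName, string newName)
    {
        var name = Name == oldName ? newName : Name;
        return new ImmutablePageNode(
            name, PageTitle, Children.Select(c => c.Rename(oldName, newName)));
    }
}

// Stand-in for the real cache entry (assumption: a single cached root).
public static class PageTreeCache
{
    private static volatile ImmutablePageNode _root;

    public static ImmutablePageNode Root
    {
        get { return _root; }
        set { _root = value; } // replacing the reference replaces the cached entry
    }
}
```

After saving a rename to the database, the system itself would run something like `PageTreeCache.Root = PageTreeCache.Root.Rename("About", "Company");`. Plugins that still hold the old root keep seeing a consistent, if stale, tree. This sketch copies the whole tree on each rename and matches pages by name only; a real implementation would more likely match on a page id and reuse unchanged subtrees.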
Upvotes: 3