Reputation: 17042
I'll most probably be using MemCache to cache some database results. As I have never implemented caching before, I thought it would be a good idea to ask those of you who have. The system I'm writing may have scripts running concurrently at some point. This is what I'm planning on doing:
The potential problem I see is at steps 4 and 6. If we have, for example, 100 sites with heavy traffic, the script may have several instances running simultaneously. How could I guarantee that when the cache expires it'll get regenerated once and the data will be intact?
Upvotes: 0
Views: 624
Reputation: 13614
How could I guarantee that when the cache expires it'll get regenerated once and the data will be intact?
The approach to caching I take is, for lack of a better word, a "lazy" implementation. That is, you don't cache something until it has been retrieved once, in the hope that someone will need it again. Here's the pseudocode for that algorithm:
// returns false if there is no value or the value is expired
result = cache_check(key)
if (!result)
{
    result = fetch_from_db()
    // set it for next time, until it expires anyway
    cache_set(key, result, expiry)
}
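A minimal runnable sketch of the pseudocode above, in Python. The `SimpleCache` class is a hypothetical in-memory stand-in for a memcache client, and `get_or_load` and its `loader` argument are illustrative names, not part of any real library. It does nothing to stop several concurrent scripts from regenerating the same key at once.

```python
import time


class SimpleCache:
    """Hypothetical in-memory stand-in for a memcache client."""

    def __init__(self):
        self._store = {}  # key -> (value, absolute expiry time or None)

    def get(self, key):
        """Return the cached value, or None if missing or expired."""
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._store[key]  # drop the stale entry
            return None
        return value

    def set(self, key, value, ttl=None):
        """Store a value; ttl is in seconds, None means no expiry."""
        expires_at = None if ttl is None else time.monotonic() + ttl
        self._store[key] = (value, expires_at)


def get_or_load(cache, key, loader, ttl=None):
    """Lazy caching: only hit the database on a cache miss."""
    result = cache.get(key)
    if result is None:  # note: a legitimate None result is never cached
        result = loader()
        # set it for next time, until it expires anyway
        cache.set(key, result, ttl)
    return result
```

Calling `get_or_load(cache, "comments:42", fetch_from_db, ttl=60)` twice in a row should invoke the hypothetical `fetch_from_db` only on the first call.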
This works pretty well for what we want to use it for, as long as you use the cache intelligently and understand that not all information is the same. For example, in a hypothetical user comment system, you don't need an expiry time because you can simply invalidate the cache whenever a user posts a new comment on an article, so the next time comments are loaded, they're recached. Some information, however (weather data comes to mind), should get an explicit expiry time, since you're not relying on user input to tell you when the data has changed.
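The comment-system case might look roughly like the sketch below. Both dictionaries are hypothetical stand-ins (one for memcache, one for the database), and the function names are made up for illustration.

```python
comment_cache = {}  # hypothetical stand-in for memcache: article_id -> comments
comments_db = {}    # hypothetical stand-in for the database table


def load_comments(article_id):
    """Return an article's comments, caching them on the first read."""
    cached = comment_cache.get(article_id)
    if cached is None:
        # cache miss: read from the "database" and cache with no expiry
        cached = list(comments_db.get(article_id, []))
        comment_cache[article_id] = cached
    return cached


def post_comment(article_id, text):
    """Write the comment, then invalidate the cached list for that article."""
    comments_db.setdefault(article_id, []).append(text)
    # the next load_comments() call will recache the fresh list
    comment_cache.pop(article_id, None)
```

Because the write path drops the cached entry, readers never need a timeout to see new comments. Data that changes without passing through your own code, such as weather, has no such hook and needs an expiry time instead.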
For what it's worth, memcache works well in a clustered environment, and you should find that setting one up isn't hard, so this should scale pretty easily to whatever you need.
Upvotes: 2