Reputation: 26086
I have a situation at www.zipstory.com (beta) where I have n-permutations of feeds coming from the same database. For example, someone can get a feed for whatever city they are interested in, and combine as many of these cities as they like, with everything sorted by most recent or most votes.
How should I cache things for each user without completely maxing out the available memory when there are thousands of users at the same time?
My only guess is: don't. I could come up with a client-side strategy where the city results are combined in the browser; that way the server could still cache with a one-size-fits-all strategy, per city.
What approaches do you suggest? I'm on unfamiliar ground at this point and could use a good strategy. I noticed this website does not do that, but Facebook does; they must be pulling from a pool of cached user feeds and combining them client-side. Not sure; again, I'm not smart enough to figure this out just yet.
In other words...
Each city has its own feed. Each user has an n-permutation of city feeds combined.
I would like to see possible solutions to this problem using C# and ASP.NET.
Adding to this February 28th, 2013. Here's what I did based on your comments, so THANKS!...
This does mean there is some CPU work every time regardless, since I have to combine the city lists into a single list, but it avoids going to the database on every request, and everyone benefits from faster page response times. The main drawback is that, since I'm no longer doing a single UNION query across cities, this requires one query per uncached city. Each city is checked individually for a cache hit, though, so 10 queries for 10 cities would only happen if the site is a dead zone.
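Roughly, a minimal sketch of what this looks like with the built-in ASP.NET cache; FeedItem and LoadCityFeedFromDb are placeholder names standing in for my real entity and data-access code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.Caching;

public class FeedItem
{
    public int Id { get; set; }
    public string City { get; set; }
    public DateTime PostedAt { get; set; }
    public int Votes { get; set; }
}

public static class CityFeedCache
{
    // Returns a combined feed for the requested cities, querying the
    // database only for cities that are not already cached.
    public static List<FeedItem> GetCombinedFeed(IEnumerable<string> cities)
    {
        var combined = new List<FeedItem>();
        foreach (var city in cities)
        {
            string key = "feed:city:" + city;
            var cityFeed = HttpRuntime.Cache[key] as List<FeedItem>;
            if (cityFeed == null)
            {
                // Cache miss: one query for this city only.
                cityFeed = LoadCityFeedFromDb(city);
                HttpRuntime.Cache.Insert(key, cityFeed, null,
                    DateTime.UtcNow.AddMinutes(5), Cache.NoSlidingExpiration);
            }
            combined.AddRange(cityFeed);
        }
        // The per-request CPU cost mentioned above: merge and re-sort
        // the cached per-city lists into one feed.
        return combined.OrderByDescending(i => i.PostedAt).ToList();
    }

    // Placeholder for the real per-city query.
    private static List<FeedItem> LoadCityFeedFromDb(string city)
    {
        throw new NotImplementedException();
    }
}
```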
Upvotes: 4
Views: 1969
Reputation: 8419
Judge the situation by its critical constraints: memory and bandwidth.
If memory is not a problem, consider caching the whole feed and retrieving items from there. For this you can use distributed cache solutions, some of which are free; start with memcached, http://memcached.org/ . People refer to this approach as load-ahead.
Sometimes memory is a problem, for instance if you want to use the ASP.NET cache with expiration and priorities. In that case, cached data can be evicted at any point when memory runs low, so you have to load it again on demand (called load-through), which costs bandwidth, and your code has to be smart enough to cope. If this is an option for you, try to cache as little as possible: cache each loaded item, and when a user requests a feed, check whether all of its items are present in the cache; if not, fetch either all of them or just the missing ones. I have done something similar in the past but cannot provide the code. The key point is: cache the entities, and then cache each feed as a list of references (IDs) to those entities. When a particular feed is requested, check that all of its references are still valid in the cache. By the way, ASP.NET provides cache dependencies for exactly such scenarios, so read about those too; they may be helpful.
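A rough sketch of that entities-plus-references idea with the built-in ASP.NET cache (FeedItem is the placeholder class sketched in the question; LoadItemsFromDb is a hypothetical data-access call, not a real API):

```csharp
using System;
using System.Collections.Generic;
using System.Web;
using System.Web.Caching;

public static class ReferenceCache
{
    // A feed is cached as a list of entity IDs; the entities themselves
    // are cached individually under their own keys.
    public static List<FeedItem> GetFeed(string feedKey)
    {
        var ids = HttpRuntime.Cache["feed:" + feedKey] as List<int>;
        if (ids == null)
            return null; // feed reference list expired; caller rebuilds it

        var items = new List<FeedItem>(ids.Count);
        var missing = new List<int>();
        foreach (int id in ids)
        {
            var item = HttpRuntime.Cache["item:" + id] as FeedItem;
            if (item == null) missing.Add(id);
            else items.Add(item);
        }

        if (missing.Count > 0)
        {
            // Some referenced entities were evicted: fetch only those.
            foreach (var item in LoadItemsFromDb(missing))
            {
                HttpRuntime.Cache.Insert("item:" + item.Id, item, null,
                    DateTime.UtcNow.AddMinutes(10), Cache.NoSlidingExpiration);
                items.Add(item);
            }
        }
        return items;
    }

    // Placeholder for a query that loads just the missing entities.
    private static IEnumerable<FeedItem> LoadItemsFromDb(List<int> ids)
    {
        throw new NotImplementedException();
    }
}
```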
In any case, keep the Decorator design pattern in mind when implementing the data access layer. It would allow you to: 1) postpone caching concerns to a later development phase, and 2) switch between the two approaches described above depending on how things go. I would start with the simpler (and cheaper) built-in solution, and then switch to a distributed cache solution when really needed.
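For illustration, a minimal Decorator sketch under assumed names (IFeedRepository, DbFeedRepository, CachingFeedRepository): the caching repository wraps the plain database repository behind the same interface, so callers never change when you add caching or swap the cache store:

```csharp
using System;
using System.Collections.Generic;
using System.Web;
using System.Web.Caching;

public interface IFeedRepository
{
    List<FeedItem> GetCityFeed(string city);
}

// The plain data-access implementation; query details omitted.
public class DbFeedRepository : IFeedRepository
{
    public List<FeedItem> GetCityFeed(string city)
    {
        throw new NotImplementedException();
    }
}

// The decorator: same interface, caching added around the inner repository.
public class CachingFeedRepository : IFeedRepository
{
    private readonly IFeedRepository _inner;

    public CachingFeedRepository(IFeedRepository inner)
    {
        _inner = inner;
    }

    public List<FeedItem> GetCityFeed(string city)
    {
        string key = "feed:city:" + city;
        var cached = HttpRuntime.Cache[key] as List<FeedItem>;
        if (cached != null) return cached;

        var feed = _inner.GetCityFeed(city);
        HttpRuntime.Cache.Insert(key, feed, null,
            DateTime.UtcNow.AddMinutes(5), Cache.NoSlidingExpiration);
        return feed;
    }
}

// Usage: start with new DbFeedRepository() alone, and wrap it in
// new CachingFeedRepository(...) when caching becomes worthwhile.
```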
Upvotes: 2
Reputation: 3444
Have you considered caching generic feeds and tagging them? Then, per user, you just store a reference to the tag/keyword.
Another possibility is to store the generic feed and filter on the client. This will increase your bandwidth usage, but save on cache cost.
And if you are on HTML5, use Local Storage to hold the user's preferences.
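A short sketch of the tag idea (all names here are made up, and FeedItem is the class sketched in the question): the tagged feed is cached once and shared by everyone, while each user carries only the small list of tags they follow:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.Caching;

public static class TaggedFeedCache
{
    // Per-user input is only the tag keywords, never the feed items.
    public static List<FeedItem> GetUserFeed(IEnumerable<string> userTags)
    {
        return userTags
            .SelectMany(t => GetTaggedFeed(t))
            .OrderByDescending(i => i.PostedAt)
            .ToList();
    }

    // The generic feed for a tag is cached once and shared by everyone.
    private static List<FeedItem> GetTaggedFeed(string tag)
    {
        string key = "feed:tag:" + tag;
        var feed = HttpRuntime.Cache[key] as List<FeedItem>;
        if (feed == null)
        {
            feed = LoadTaggedFeedFromDb(tag); // hypothetical query
            HttpRuntime.Cache.Insert(key, feed, null,
                DateTime.UtcNow.AddMinutes(5), Cache.NoSlidingExpiration);
        }
        return feed;
    }

    private static List<FeedItem> LoadTaggedFeedFromDb(string tag)
    {
        throw new NotImplementedException();
    }
}
```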
Upvotes: 1
Reputation: 499092
Cache only the minimal amount of per-user information that you need.
For example, if it fits in memory, cache the complete set of feeds and store, per user, only the IDs of the feeds they are interested in.
When they request their feeds, just get those out of memory.
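A minimal sketch of that, with a placeholder pool key and the FeedItem class from the question's sketch: the complete set of city feeds sits in one cached pool, and per user you keep only the list of city names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.Caching;

public static class PooledFeedCache
{
    // Cache the complete set of feeds once: city name -> that city's feed.
    public static void LoadPool(IDictionary<string, List<FeedItem>> allCityFeeds)
    {
        HttpRuntime.Cache.Insert("feed:pool", allCityFeeds, null,
            DateTime.UtcNow.AddMinutes(10), Cache.NoSlidingExpiration);
    }

    // Per-user storage is just the small list of city names they follow.
    public static List<FeedItem> GetUserFeed(IEnumerable<string> userCities)
    {
        var pool = HttpRuntime.Cache["feed:pool"]
            as IDictionary<string, List<FeedItem>>;
        if (pool == null)
            return null; // pool expired; caller reloads it from the database

        return userCities
            .Where(pool.ContainsKey)
            .SelectMany(city => pool[city])
            .OrderByDescending(i => i.PostedAt)
            .ToList();
    }
}
```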
Upvotes: 2