peter

Reputation: 15139

How to organize Redis cache for different users with simple write / delete approach

In an application we run some complex queries on a large postgres data set. Since the underlying data is updated only every month, we cache the results (currently in postgres) based on the parameters which were used for calculations. We end up with a simple key/value table:

key - hashed JSON object representing the parameters of the calculation
value - the query's JSON result
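For illustration, a minimal sketch of how such a key could be derived, assuming Python; the function name and parameter fields are made up:

```python
import hashlib
import json

def cache_key(params: dict) -> str:
    """Derive a stable cache key from the calculation parameters.

    The JSON is serialized with sorted keys so that logically equal
    parameter objects always hash to the same value.
    """
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Two equivalent parameter objects map to the same key
assert cache_key({"from": "2014-01", "metric": "x"}) == cache_key({"metric": "x", "from": "2014-01"})
```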

So far, so easy - however, this table exists once per customer, each of whom has their own namespace in postgres.

After a customer uploads new data, the rows in the cache table are deleted and the cache is repopulated again as queries come in.

To do this in Redis I found several possible approaches:

  1. use one database per customer - according to a comment on Google Groups, this should be fine up to ~1000 databases with LRU enabled, and even beyond without it. (1000 databases is definitely not a limit for now, it just doesn't feel like the right way to do it.)
  2. use one hash per customer and treat it as our key/value table. I don't think LRU would be able to evict the least used keys from within the hashes; instead it would evict the caches of the least active customers, which might be OK, but we probably wouldn't have the power-law access pattern across keys which is suggested for using LRU.
  3. namespace the keys as customer:params_hash and, when data needs to be deleted, scan through all keys matching the pattern customer:* (see the sketch after this list). The necessary approach and script are described here: How to atomically delete keys matching a pattern using Redis
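For option 3, a minimal sketch of what the namespaced write/read path could look like, assuming Python with the redis-py client; the key layout customer:params_hash follows the question, everything else is illustrative:

```python
import redis

r = redis.Redis()  # assumes a local Redis instance

def put_result(customer: str, params_hash: str, result_json: str) -> None:
    # Namespace the cache entry under the customer: "customer:params_hash"
    r.set(f"{customer}:{params_hash}", result_json)

def get_result(customer: str, params_hash: str):
    # Returns the cached JSON result, or None on a cache miss
    return r.get(f"{customer}:{params_hash}")
```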

The data would be written continuously while users are working with the application, in order to speed up access to reports / queries they have already run and to reduce the load on the system. The underlying data is updated on average once every one to two months, and then the cache would get deleted (based on a trigger from postgres using a foreign data wrapper).

Are these three the only options? Which one would be the best fit and the way to go in Redis?

Upvotes: 4

Views: 3283

Answers (1)

my-nick

Reputation: 711

Solution 3 seems to be the best (you will only be deleting keys once per month). But remember that using the KEYS command isn't recommended on production servers. I think you don't need to delete those keys atomically and should use SCAN instead: http://redis.io/commands/scan
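A minimal sketch of such a non-atomic, SCAN-based delete, assuming Python with redis-py (scan_iter wraps the SCAN command); the function name and batch size are illustrative:

```python
import redis

r = redis.Redis()

def drop_customer_cache(customer: str, batch_size: int = 500) -> None:
    """Delete all cache keys of one customer without blocking the server.

    SCAN walks the keyspace incrementally, so unlike KEYS it never asks
    Redis to return every matching key in a single blocking call.
    """
    batch = []
    for key in r.scan_iter(match=f"{customer}:*", count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            r.delete(*batch)  # delete in batches to keep round trips low
            batch.clear()
    if batch:
        r.delete(*batch)
```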

Also, benchmark your new solution in a production environment, because I think the main bottleneck of your old solution isn't the caching engine (postgresql, redis, or memcached) but the network and decoding ( http://kiss-web.blogspot.com/ ).

Upvotes: 1
