guettli

Reputation: 27855

Invalidating several grouped cache keys

I have a model TicketType which has about 500 instances.

It changes only a few times per week.

But if it changes, I need to invalidate all cached values which used the old TicketTypes.

Unfortunately some cache keys are not fixed. They contain computed data.

I see these solutions:

Use the version argument and update the version value in a post_save signal handler of TicketType (a sketch of this follows below).

Use a common prefix for all cache keys which are based on TicketType. Then invalidate all matching keys in a post_save signal handler.

I guess there is a third, and better way ...
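A minimal sketch of the first option (the counter key and helper name are illustrative, and cache.get_or_set needs Django 1.9+):

from django.core.cache import cache
from django.db.models.signals import post_save
from django.dispatch import receiver

VERSION_KEY = 'ticket-type-version'  # illustrative counter key

def ticket_type_version():
    # The counter defaults to 1 the first time it is read.
    return cache.get_or_set(VERSION_KEY, 1)

# TicketType is the model from the question; import it from your app.
@receiver(post_save, sender=TicketType)
def bump_ticket_type_version(sender, **kwargs):
    try:
        # Everything stored under the old version becomes unreachable.
        cache.incr(VERSION_KEY)
    except ValueError:  # counter evicted or never set
        cache.set(VERSION_KEY, 2)

# Reads and writes then pass the current version explicitly:
# cache.set(hash_key, tree, version=ticket_type_version())
# tree = cache.get(hash_key, version=ticket_type_version())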

Example:

TicketType is a tree. Visibility of TicketTypes is bound to permissions. Two users might see the tree differently if they have different permissions, so we cache it according to the permissions. The permissions of a user get serialized and hashed. The cache key is created from the hash and a fixed part:

hash_key = 'ticket-type-tree--%s' % hashed_permissions

If the TicketType tree changes, we need to be sure that no old data gets loaded from the cache. Active invalidation is not needed, as long as no old data gets used.
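For concreteness, the hashing could look like this (the use of get_all_permissions() and SHA-1 are assumptions, not part of the question):

import hashlib

def tree_cache_key(user):
    # Sort the permission strings so equal permission sets hash identically.
    perms = ';'.join(sorted(user.get_all_permissions()))
    hashed_permissions = hashlib.sha1(perms.encode('utf-8')).hexdigest()
    return 'ticket-type-tree--%s' % hashed_permissions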

Upvotes: 6

Views: 1243

Answers (5)

Keith Brings

Reputation: 381

Ah, this is how I managed scenarios like this in the past. It requires more cache reads/writes to build the final cache keys, but if those requests are faster than prematurely invalidating records, it's not too bad an overhead. https://github.com/noizu/fragmented-keys
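The idea translates roughly like this in Python/Django terms (a sketch of the pattern, not code from the linked library):

import uuid

from django.core.cache import cache

def group_token(group):
    # One extra cache read per lookup resolves the group's current token.
    return cache.get_or_set('fragment-token--%s' % group, uuid.uuid4().hex)

def fragmented_key(group, suffix):
    # The token is embedded in every key built for the group.
    return '%s--%s--%s' % (group, group_token(group), suffix)

def invalidate_group(group):
    # Old entries are never deleted; they just become unreachable and expire.
    cache.set('fragment-token--%s' % group, uuid.uuid4().hex)

# cache.set(fragmented_key('ticket-type-tree', hashed_permissions), tree)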

Upvotes: 1

Ahmed

Reputation: 3012

Use Redis to cache your models

The way I would cache my instances would be the following:

1-Make sure you are getting one item at a time, e.g. Model.objects.get(foo='bar'), and that you're using the attribute foo every time you fetch the model from the database. That attribute will be used to invalidate the data later.

2-Override the save() method and make sure it saves the data to the cache, keyed by the foo attribute.

E.g:

import json

import redis
from django.db import models

r = redis.StrictRedis()

class Model(models.Model):
    foo = models.CharField(max_length=100, unique=True)
    bar = models.CharField(max_length=100)

    def save(self, *args, **kwargs):
        # Write through to the cache, keyed by foo.
        r.set(self.foo, self.serialize_model())
        super(Model, self).save(*args, **kwargs)

    def serialize_model(self):
        return json.dumps({'foo': self.foo, 'bar': self.bar})

3-Override the get() method on a custom manager to return the serialized object before hitting the database (overriding get() on the model class itself would not be called by Model.objects.get()).

E.g:

class CachedManager(models.Manager):
    def get(self, *args, **kwargs):
        # Serve the cached copy when looking up by foo; note this returns
        # the cached dict, not a model instance.
        if 'foo' in kwargs:
            cached = r.get(kwargs['foo'])
            if cached is not None:
                return json.loads(cached)
        return super(CachedManager, self).get(*args, **kwargs)

class Model(models.Model):
    ...
    objects = CachedManager()

4-Override the delete() method to remove the cached copy in case the instance gets deleted.

E.g

class Model(models.Model):
    ...
    def delete(self, *args, **kwargs):
        # Drop the cached copy before removing the row.
        r.delete(self.foo)
        super(Model, self).delete(*args, **kwargs)

Replace the Model class with your model; in this case it would be TicketType.

One thing: I'm assuming you will not touch the database outside your Django app. If you're using raw SQL anywhere else, this will not work.

Look for the Redis commands on their website; they have commands to set, get and delete. If you're using a different caching backend, look up its equivalents for set, get and delete.

Upvotes: 1

Lyudmil Nenov

Reputation: 257

In a TicketType post_save signal handler:
a) generate the keys depending on the permissions of all users and invalidate them
b) generate a key for every permission permutation (if you can calculate them) and invalidate those keys
c) use a second memcached instance to store only these cache entries and clear it entirely (easiest; a sketch follows)
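A minimal sketch of option c), assuming a dedicated cache alias named 'ticket_types' is configured in the CACHES setting:

from django.core.cache import caches
from django.db.models.signals import post_save
from django.dispatch import receiver

# TicketType is the model from the question; import it from your app.
@receiver(post_save, sender=TicketType)
def clear_ticket_type_cache(sender, **kwargs):
    # Wiping the dedicated cache invalidates every permission-dependent
    # tree at once, without enumerating the keys.
    caches['ticket_types'].clear()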

P.S.: A pro tip would be to refresh the caches instead of just invalidating them. However, an uncaught exception in a Django signal can be trouble, so be wary.

Upvotes: 0

dnozay

Reputation: 24324

You can use the ticket modification time as part of your cache key.

hash_key = 'ticket-type-tree--%s-%s' % (hashed_permissions, tree.lastmodified)

You can add a DateTimeField with auto_now=True. If getting the modification time from the db is too expensive, you may cache that as well.
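A minimal sketch of that field (where lastmodified lives, e.g. on the tree's root node, is an assumption):

from django.db import models

class TicketType(models.Model):
    # Updated automatically on every save().
    lastmodified = models.DateTimeField(auto_now=True)

# hash_key = 'ticket-type-tree--%s-%s' % (hashed_permissions,
#                                         tree.lastmodified.isoformat())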

Usually, updating the cache in a post_save signal handler is fine, unless you want consistent data at all times and are willing to pay the extra cost for transactions.

Upvotes: 1

Marcanpilami

Reputation: 584

Well, basically your issue is simply cache key expressiveness. When you have to do something as complicated as hashing a set to get the key, it must be a hint that something is missing.

In your case, I believe what's missing is simply a "permission set" object. You could call it a group, a role (as in RBAC)... That's why I asked you if the sets were repetitive - actually, your hash key is simply a way of recreating the ID of a set object that does not exist.

So a solution would be:

  • to create a role model, with an M2M relation to users and an M2M relation to permissions (which, as I understand it, are linked to your TicketTypes)
  • use an event handler to catch saves to TicketType.
    • fetch all impacted roles (through permissions)
    • generate the keys (something like ticket-type-TREEID-ROLEID) and invalidate them
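A hedged sketch of that handler; the Role model layout, the permissions relation on TicketType and the tree_id attribute are all illustrative assumptions:

from django.core.cache import cache
from django.db import models
from django.db.models.signals import post_save
from django.dispatch import receiver

class Role(models.Model):
    # The "permission set" object proposed above.
    users = models.ManyToManyField('auth.User')
    permissions = models.ManyToManyField('auth.Permission')

@receiver(post_save, sender=TicketType)
def invalidate_role_trees(sender, instance, **kwargs):
    # Only roles whose permissions touch the saved TicketType are affected.
    roles = Role.objects.filter(permissions__in=instance.permissions.all())
    cache.delete_many(['ticket-type-%s-%s' % (instance.tree_id, role.pk)
                       for role in roles])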

Two final remarks:

  • Sometimes cache.clear() is the solution - especially if you don't use the cache for anything else
  • You say your SQL query count is huge when navigating the tree. If you have not already tried it, you may simply want to optimize that with select_related and prefetch_related (see the docs and the example below).
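For example (the parent/children field names are assumptions about the tree model):

nodes = (TicketType.objects
         .select_related('parent')       # join the FK in the same query
         .prefetch_related('children'))  # batch the reverse lookups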

Upvotes: 0
