Davide R.

Reputation: 900

Increment a counter and trigger an action when a threshold is exceeded

I have a model like this

class Thingy(models.Model):
    # ...
    failures_count = models.IntegerField()

I have concurrent processes (Celery tasks) that need to do this:

  1. do some kind of processing
  2. if the processing fails, increment failures_count of the respective Thingy
  3. if failures_count exceeds a threshold for some Thingy, issue a warning, but ONLY ONE warning.
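A plain-Python sketch (no Django; threads and a dict stand in for the Celery tasks and the database) of why a naive read-check-write version of steps 2 and 3 races — both workers read the same value, both see the threshold transition, and one increment is lost:

```python
import threading

THRESHOLD = 3
state = {"failures_count": THRESHOLD - 1}   # one failure short of the threshold
warnings = []
barrier = threading.Barrier(2)

def report_failure_naive():
    current = state["failures_count"]        # step 2: read the counter
    barrier.wait()                           # force both workers to read before either writes
    if current + 1 == THRESHOLD:             # step 3: threshold check
        warnings.append("warned")
    state["failures_count"] = current + 1    # write back: the two updates collide

workers = [threading.Thread(target=report_failure_naive) for _ in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(len(warnings))                # 2 -- the "ONLY ONE warning" rule is violated
print(state["failures_count"])      # 3 -- one increment was lost
```

The barrier makes the interleaving deterministic for the demo; in production the same interleaving simply happens occasionally.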

I have some ideas about how to do this without a race condition, for example using explicit locks (via select_for_update):

@transaction.commit_on_success
def report_failure(thingy_id):
    current, = (Thingy.objects
               .select_for_update()
               .filter(id=thingy_id)
               .values_list('failures_count'))[0]
    if current == THRESHOLD:
        issue_warning_for(thingy_id)
    Thingy.objects.filter(id=thingy_id).update(
        failures_count=F('failures_count') + 1
    )

Or by using Redis (it's already there) for synchronization:

@transaction.commit_on_success
def report_failure(thingy_id):
    Thingy.objects.filter(id=thingy_id).update(
        failures_count=F('failures_count') + 1
    )
    value = Thingy.objects.only('failures_count').get(id=thingy_id).failures_count
    if value >= THRESHOLD:
        if redis.incr('issued_warning_%s' % thingy_id) == 1:
            issue_warning_for(thingy_id)

Both solutions use locks. As I'm using PostgreSQL, is there a way to achieve this without locking?


I'm editing the question to include the answer (thanks to Sean Vieira, see answer below). The question asked about a way to avoid locking and this answer is optimal in that it leverages multi-version concurrency control (MVCC) as implemented by PostgreSQL.

This specific question explicitly allowed using PostgreSQL features, and though many RDBMSs implement UPDATE ... RETURNING, it is not standard SQL and is not supported by Django's ORM out of the box, so it requires raw SQL via raw(). The same SQL statement will work in other RDBMSs, but every engine requires its own discussion of synchronization, transaction isolation and concurrency models (e.g. MySQL with MyISAM would still use locks).

def report_failure(thingy_id):
    with transaction.commit_on_success():
        failures_count = Thingy.objects.raw("""
            UPDATE Thingy
            SET failures_count = failures_count + 1
            WHERE id = %s
            RETURNING id, failures_count;  -- raw() requires the primary key in the result set
        """, [thingy_id])[0].failures_count

    if failures_count == THRESHOLD:
        issue_warning_for(thingy_id)
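Outside Django, the same atomic increment-and-read can be sketched with the plain DB-API. SQLite also supports RETURNING (since 3.35), so a self-contained illustration is possible; the table and column names here are illustrative, not the real schema:

```python
import sqlite3

THRESHOLD = 3
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thingy (id INTEGER PRIMARY KEY, failures_count INTEGER NOT NULL)")
conn.execute("INSERT INTO thingy (id, failures_count) VALUES (1, 0)")

def report_failure(thingy_id):
    # one atomic statement: increment and read back the new value
    (count,) = conn.execute(
        "UPDATE thingy SET failures_count = failures_count + 1 "
        "WHERE id = ? RETURNING failures_count",
        (thingy_id,),
    ).fetchone()
    return count == THRESHOLD       # True for exactly one caller

results = [report_failure(1) for _ in range(5)]
print(results)  # [False, False, True, False, False]
```

Because the increment and the read happen in one statement, exactly one caller observes the counter crossing the threshold, which is what makes the single-warning guarantee possible without an explicit lock.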

Upvotes: 3

Views: 1225

Answers (2)

Sean Vieira

Reputation: 159905

As far as I can tell, Django's ORM doesn't support this out of the box. However, that doesn't mean it can't be done; you just need to drop down to the SQL level (exposed in Django's ORM via a Manager's raw method) to make it work.

If you are using PostgreSQL >= 8.2, you can use RETURNING to get the final value of failures_count without any additional locking (the DB still takes a row lock, but only for as long as it takes to set the value; no extra time is spent communicating with the client):

# ASSUMPTIONS: All IDs are valid and IDs are unique
# More defenses are necessary if either of these assumptions
# are not true.
failure_count = Thingy.objects.raw("""
    UPDATE Thingy
    SET failure_count = failure_count + 1
    WHERE id = %s
    RETURNING failure_count;
""", [thingy_id])[0].failure_count

if failure_count == THRESHOLD:
    issue_warning_for(thingy_id)

Upvotes: 5

vutran

Reputation: 2175

I don't really know the reason you have to do this job without locking; how many tasks do you have running concurrently?

However, I think there is a way to do this without locking:

You should have another model, for example Failure:

class Failure(models.Model):
    thingy = models.ForeignKey(Thingy)

Your *report_failure* should look like this:

from django.db import transaction
@transaction.commit_manually
def flush_transaction():
    transaction.commit()

@transaction.commit_on_success
def report_failure(thingy_id):
    thingy = Thingy.objects.get(id=thingy_id)
    # uncomment the following line if you find the query returns cached (stale) results
    #flush_transaction()

    current = thingy.failure_set.count()
    if current >= THRESHOLD:
        issue_warning_for(thingy_id)
    Failure.objects.create(thingy=thingy)

I know this approach is quite bad because it creates a lot of Failure records, but it is the only idea I can come up with. Sorry about that.

Upvotes: 0
