Quintin Par

Reputation: 16252

Retry Celery tasks with exponential back off

For a task like this:

from celery.decorators import task

@task()
def add(x, y):
    if not x or not y:
        raise Exception("test error")
    return self.wait_until_server_responds(

If it throws an exception and I want to retry it from the daemon side, how can I apply an exponential backoff algorithm, i.e. retry after 2^2, 2^3, 2^4, etc. seconds?

Also, is the retry maintained on the server side, so that if the worker happens to get killed, the next worker that spawns will pick up the retry task?

Upvotes: 89

Views: 49779

Answers (3)

Rupert Angermeier

Reputation: 1015

As of Celery 4.2 you can configure your tasks to use an exponential backoff automatically: https://docs.celeryq.dev/en/stable/userguide/tasks.html#automatic-retry-for-known-exceptions

@app.task(autoretry_for=(Exception,), retry_backoff=2)
def add(x, y):
    ...

(This was already in the docs for Celery 4.1 but wasn't actually released then; see the merge request.)
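To make the effect of retry_backoff=2 concrete: as I read the Celery docs, a numeric retry_backoff is used as a delay factor, the delay doubles on each retry, and it is capped by retry_backoff_max (600 seconds by default). Jitter (retry_jitter) is enabled by default, so the real delays are randomized below these values. The plain-Python sketch below only computes the nominal, un-jittered schedule; nominal_backoff_delays is a hypothetical helper for illustration, not part of Celery.

```python
def nominal_backoff_delays(factor, max_retries, backoff_max=600):
    """Return the un-jittered delay in seconds before each retry.

    Hypothetical helper (not a Celery API). It assumes the documented
    rule: delay = factor * 2 ** retry_index, capped at backoff_max.
    """
    return [min(backoff_max, factor * 2 ** n) for n in range(max_retries)]


print(nominal_backoff_delays(2, 5))   # [2, 4, 8, 16, 32]
print(nominal_backoff_delays(2, 10))  # later delays are capped at 600
```

If you need predictable delays (for example in tests), passing retry_jitter=False to the task decorator should turn the randomization off.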

Upvotes: 59

lgylym

Reputation: 206

FYI, Celery has a utility function (get_exponential_backoff_interval in celery.utils.time) that calculates the exponential backoff time with optional jitter, so you don't need to write your own.

import random

def get_exponential_backoff_interval(
    factor,
    retries,
    maximum,
    full_jitter=False
):
    """Calculate the exponential backoff wait time."""
    # Will be zero if factor equals 0
    countdown = min(maximum, factor * (2 ** retries))
    # Full jitter according to
    # https://www.awsarchitectureblog.com/2015/03/backoff.html
    if full_jitter:
        countdown = random.randrange(countdown + 1)
    # Adjust according to maximum wait time and account for negative values.
    return max(0, countdown)

Upvotes: 3

asksol

Reputation: 19499

The task.request.retries attribute contains the number of retries so far, so you can use this to implement exponential backoff:

from celery.task import task

@task(bind=True, max_retries=3)
def update_status(self, auth, status):
    try:
        Twitter(auth).update_status(status)
    except Twitter.WhaleFail as exc:
        raise self.retry(exc=exc, countdown=2 ** self.request.retries)

To prevent a thundering herd problem, where many failed tasks retry at the same moment, consider adding random jitter to your exponential backoff:

import random
# Inside the except block of the task above:
raise self.retry(exc=exc, countdown=int(random.uniform(2, 4) ** self.request.retries))
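To see what range of delays the jitter expression above produces, here is a minimal standalone sketch. The helper name jittered_countdown and its low, high, and rng parameters are assumptions made for illustration; only the formula int(random.uniform(2, 4) ** retries) comes from the answer.

```python
import random


def jittered_countdown(retries, low=2.0, high=4.0, rng=None):
    """Return a randomized delay in seconds for the given retry count.

    Hypothetical helper: draws a base uniformly from [low, high] and
    raises it to the number of retries so far, as in the answer above.
    """
    rng = rng or random
    return int(rng.uniform(low, high) ** retries)


# A seeded generator keeps the demonstration repeatable.
rng = random.Random(42)
for retries in range(5):
    delay = jittered_countdown(retries, rng=rng)
    # Each delay falls between 2 ** retries and 4 ** retries seconds.
    print(retries, delay, (2 ** retries, 4 ** retries))
```

Note that the spread grows quickly: by the fourth retry the delay can land anywhere between 16 and 256 seconds, so you may want to cap it with min() if long waits are a problem.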

Upvotes: 162
