Recover from task failed beyond max_retries

Jack M. picture Jack M. · Jun 28, 2011 · Viewed 12.4k times · Source

I am attempting to asynchronously consume a web service because it takes up to 45 seconds to return. Unfortunately, this web service is also somewhat unreliable and can throw errors. I have set up django-celery and have my tasks executing, which works fine until the task fails beyond max_retries.

Here is what I have so far:

@task(default_retry_delay=5, max_retries=10)
def request(xml):
    try:
        server = Client('https://www.whatever.net/RealTimeService.asmx?wsdl')
        xml = server.service.RunRealTimeXML(
            username=settings.WS_USERNAME,
            password=settings.WS_PASSWORD,
            xml=xml
        )
    except Exception, e:
        result = Result(celery_id=request.request.id, details=e.reason, status="i")
        result.save()
        try:
            return request.retry(exc=e)
        except MaxRetriesExceededError, e:
            result = Result(celery_id=request.request.id, details="Max Retries Exceeded", status="f")
            result.save()
            raise
    result = Result(celery_id=request.request.id, details=xml, status="s")
    result.save()
    return result

Unfortunately, MaxRetriesExceededError is not being thrown by retry(), so I'm not sure how to handle the failure of this task. Django has already returned HTML to the client, and I am checking the contents of Result via AJAX, which is never getting to a full fail f status.

So the question is: How can I update my database when the Celery task has exceeded max_retries?

Answer

RecursivelyIronic picture RecursivelyIronic · Feb 27, 2016

The issue is that celery is trying to re-raise the exception you passed in when it hits the retry limit. The code for doing this re-raising is here: https://github.com/celery/celery/blob/v3.1.20/celery/app/task.py#L673-L681

The simplest way around this is to just not have celery manage your exceptions at all:

@task(max_retries=10)
def mytask():
    try:
        do_the_thing()
    except Exception as e:
        try:
            mytask.retry()
        except MaxRetriesExceededError:
            do_something_to_handle_the_error()
            logger.exception(e)