Stuart Buckingham

Reputation: 1784

Django + Celery "cannot serialize '_io.BufferedReader' object"

While trying to pass a file off to a Celery task, I am sometimes getting the exception "cannot serialize '_io.BufferedReader' object". This seems to be happening with some files and not others. The endpoint is an APIView with the following to launch the task:

from celery import signature

task = signature(
    data.get('action'),
    kwargs={
        'data': data,
        'authorization': authorization,
        'files': files,
    },
).apply_async()

It works fine when certain files are included in the request, but throws the exception for others.

Upvotes: 1

Views: 1961

Answers (1)

Stuart Buckingham

Reputation: 1784

The blocker was the file upload handler. When a file larger than FILE_UPLOAD_MAX_MEMORY_SIZE is uploaded, Django invokes the TemporaryFileUploadHandler, which creates a TemporaryUploadedFile stored on and streamed from disk. That class wraps an open file handle and is not serializable by pickle, so pickle/kombu throws the "cannot serialize '_io.BufferedReader' object" exception.
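You can see which handler produced each upload by checking the class of every file in the view. A minimal sketch, assuming files is the request.FILES mapping from the question:

from django.core.files.uploadedfile import (
    InMemoryUploadedFile,
    TemporaryUploadedFile,
)

for name, f in files.items():
    if isinstance(f, TemporaryUploadedFile):
        # Above FILE_UPLOAD_MAX_MEMORY_SIZE: backed by an open temp file
        # on disk -- this is the unpicklable '_io.BufferedReader'.
        print(name, 'on disk (not picklable)')
    elif isinstance(f, InMemoryUploadedFile):
        # At or below the limit: held in an in-memory buffer, picklable.
        print(name, 'in memory (picklable)')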

The solution was to raise FILE_UPLOAD_MAX_MEMORY_SIZE in settings.py to a high value (100MB), so that files under that limit become InMemoryUploadedFiles, and also to write a check into the view that returns a more useful error:

from django.core.files.uploadedfile import TemporaryUploadedFile
from rest_framework.response import Response
from rest_framework.status import HTTP_413_REQUEST_ENTITY_TOO_LARGE

# TemporaryUploadedFile instances are streamed from disk and cannot be
# pickled, so reject the request before handing the files to Celery.
if any(isinstance(f, TemporaryUploadedFile) for f in files.values()):
    return Response(
        'File too large to upload.',
        status=HTTP_413_REQUEST_ENTITY_TOO_LARGE
    )
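For reference, the settings change itself is a one-liner. A minimal sketch, using the 100MB figure mentioned above (the value is in bytes; tune it to your deployment):

# settings.py
FILE_UPLOAD_MAX_MEMORY_SIZE = 100 * 1024 * 1024  # uploads below this stay in memory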

Not 100% sure that HTTP 413 is the most appropriate status code, but it makes sense to me, and the error message should also help end users.

Upvotes: 2
