Why is setting a Django FileField from an existing file on the same partition slow?

Question

In my Django application I have to deal with huge files. Instead of uploading them via the web app, the users may place them into a folder (called .dump) on a Samba share and then can choose the file in the Django app to create a new model instance from it. The view looks roughly like this:

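import os
from os.path import isfile

from django.conf import settings
from django.core.files import File
from django.http import JsonResponse
from django.views import View

# NCFile is the project's own model; `sample` is not defined in this excerpt.


class AddDumpedMeasurement(View):
    def get(self, request, *args, **kwargs):
        # Default to '' so a missing parameter ends up in the "File not found" branch
        filename = request.GET.get('filename', '')

        dump_dir = os.path.join(settings.MEDIA_ROOT, settings.MEASUREMENT_DATA_DUMP_PATH)
        in_file = os.path.join(dump_dir, filename)

        if isfile(in_file):
            try:
                with open(in_file, 'rb') as f:
                    # File(f) makes Django read the source and write a full copy to the storage
                    nc_file = NCFile.objects.create(sample=sample, created_by=request.user, file=File(f))

                return JsonResponse(data={'redirect': nc_file.get_absolute_url()})
            except OSError:
                return JsonResponse(data={'error': 'Couldn\'t read file'}, status=400)
        else:
            return JsonResponse(data={'error': 'File not found'}, status=400)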

As MEDIA_ROOT and .dump are on the same Samba share (which is mounted by the web server), why is moving the file to its new location so slow? I would have expected it to be almost instantaneous. Is it because I open() it and stream the bytes to the file object? If so, is there a better way to move the file to its correct destination and create the model instance?

janoliver · Accepted Answer

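Create the model instance with an empty temporary file as a placeholder, then swap that placeholder for the original file. This lets you move the original with os.rename, which is fast because on the same filesystem it only updates directory entries instead of copying the data.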

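import os
from os.path import isfile
from tempfile import NamedTemporaryFile

from django.conf import settings
from django.core.files import File

# Create the instance with an empty placeholder so Django only has to store zero bytes
tmp_file = NamedTemporaryFile()
nc_file = NCFile.objects.create(..., file=File(tmp_file))
tmp_file.close()

# Remove the empty placeholder that Django wrote into the storage
if isfile(nc_file.file.path):
    os.remove(nc_file.file.path)

# Keep the upload directory Django chose, but use the original filename;
# the storage picks a free name if that one is already taken
new_relative_path = os.path.join(os.path.dirname(nc_file.file.name), filename)
new_relative_path = nc_file.file.storage.get_available_name(new_relative_path)

# Move the real file into place and point the field at it
os.rename(in_file, os.path.join(settings.MEDIA_ROOT, new_relative_path))
nc_file.file.name = new_relative_path
nc_file.save()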
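
The move step can also be tried on its own, without Django. The following is only a minimal sketch: the helper name move_file and the directory names in the demo are made up for illustration, and the fallback to shutil.move is an addition for the case where source and destination turn out to be on different filesystems, where os.rename fails with EXDEV.

```python
import errno
import os
import shutil
import tempfile


def move_file(src, dst):
    """Move src to dst, preferring a rename over a byte-for-byte copy."""
    try:
        # On the same filesystem this only rewrites directory entries,
        # so the cost does not depend on the file size.
        os.rename(src, dst)
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise
        # Different filesystems: a plain rename is not possible,
        # so fall back to copying the data and deleting the source.
        shutil.move(src, dst)


# Small demo in a throwaway directory (hypothetical layout)
with tempfile.TemporaryDirectory() as root:
    dump_dir = os.path.join(root, ".dump")
    target_dir = os.path.join(root, "measurements")
    os.makedirs(dump_dir)
    os.makedirs(target_dir)

    src = os.path.join(dump_dir, "run1.nc")
    with open(src, "wb") as f:
        f.write(b"example payload")

    dst = os.path.join(target_dir, "run1.nc")
    move_file(src, dst)

    # The file now exists only at the destination
    moved = os.path.exists(dst) and not os.path.exists(src)
```

In the view, in_file and the full target path below MEDIA_ROOT would take the place of src and dst.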
