Reputation: 354
I am running on Django 2.1.1 and Python 3.6.5 and am performing a reasonably large POST operation (32,000 JSON objects). I have the following:
Model:
class Data(models.Model):
    # on_delete is required from Django 2.0 onwards; CASCADE is assumed here
    investigation = models.ForeignKey(Investigation, on_delete=models.CASCADE)
    usage = models.FloatField()
    sector = models.CharField(max_length=100, blank=False, default='')
    cost = models.FloatField()
    demand = models.FloatField()
Serializer:
class DataSerializer(serializers.ModelSerializer):
    class Meta:
        model = Data
        fields = ('investigation', 'usage', 'sector', 'cost', 'demand')
View:
class DataView(generics.CreateAPIView):
    def create(self, request, pk, format=None):
        data_serializer = DataSerializer(data=request.data, many=True)
        if data_serializer.is_valid():
            data_serializer.save()
The problems occur at both the is_valid() and save() steps, each of which fires off a separate query for each of the 32,000 objects.
I've spent a long time looking into the issue. My guess is that the is_valid() step is slow because of the N+1 query problem, since the foreign key is looked up for every object (although I could be wrong about this!). However, I have no idea how to apply prefetch_related in this framework.
The save() step (which is the slowest part) obviously needs to be done in one query (probably a bulk_create), but I can't find where to add the bulk_create step. I've read this question but am still none the wiser from the answer. I tried to create a ListSerializer as the question suggests, but the objects still seemed to be saved one by one.
Any pointers would be greatly appreciated.
Upvotes: 4
Views: 964
Reputation: 932
One possible solution is to perform a Django ORM bulk_create() after you validate the data using your serializer. Your view will then look something like this:
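from rest_framework import generics, status
from rest_framework.response import Response

class DataView(generics.CreateAPIView):
    def create(self, request, pk, format=None):
        data_serializer = DataSerializer(data=request.data, many=True)
        if data_serializer.is_valid():
            # Build unsaved instances, then insert them all at once
            data_objects = []
            for data_object_info in data_serializer.validated_data:
                data_objects.append(Data(**data_object_info))
            Data.objects.bulk_create(data_objects)
            return Response(status=status.HTTP_201_CREATED)
        # A view has to return a response; report validation errors here
        return Response(data_serializer.errors, status=status.HTTP_400_BAD_REQUEST)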
or just the following, if you want a one-liner:
Data.objects.bulk_create([Data(**params) for params in data_serializer.validated_data])
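Note that bulk_create only addresses the save() half of the question. As far as I know, the default related field that a ModelSerializer generates for `investigation` still looks up each referenced row individually during is_valid(). One way around that is to collect the distinct foreign keys from the payload and check them with a single query. The snippet below is only a rough sketch of that idea: the function names are made up, the payload values are invented, and it assumes the serializer has been changed to accept the id as a plain integer (not shown here).

```python
def distinct_foreign_keys(rows, key="investigation"):
    """Collect the distinct foreign-key values present in the posted rows."""
    return {row[key] for row in rows if key in row}


def missing_foreign_keys(rows, existing_ids, key="investigation"):
    """Return the foreign-key values in the payload that are not in existing_ids."""
    return distinct_foreign_keys(rows, key) - set(existing_ids)


# Example payload (made-up values) referencing two distinct investigations.
rows = [
    {"investigation": 1, "usage": 0.5, "sector": "a", "cost": 1.0, "demand": 2.0},
    {"investigation": 1, "usage": 0.7, "sector": "b", "cost": 1.5, "demand": 2.5},
    {"investigation": 2, "usage": 0.9, "sector": "c", "cost": 2.0, "demand": 3.0},
]
wanted = distinct_foreign_keys(rows)

# In the view, the existing ids could be fetched with a single query, e.g.:
#   existing = Investigation.objects.filter(pk__in=wanted).values_list("pk", flat=True)
existing = [1]  # stand-in for the query result
missing = missing_foreign_keys(rows, existing)
```

If `missing` is non-empty you can reject the request up front, so the cost is one query for the whole payload rather than one per object.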
If you don't want to clutter your view, then you can write a class or method that performs the validation (using the serializer) and creation. You can then use this inside the view.
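A minimal sketch of what such a helper might look like. `bulk_save` is a hypothetical name, and the default `batch_size` of 1000 is an arbitrary choice; `batch_size` itself is a real argument of Django's bulk_create. The helper only relies on `model_cls.objects.bulk_create`, so the stand-in classes below exist purely to make the sketch runnable without a database.

```python
def bulk_save(model_cls, validated_rows, batch_size=1000):
    """Create all rows with one bulk_create call and return how many were built."""
    instances = [model_cls(**row) for row in validated_rows]
    # batch_size caps how many rows go into each INSERT statement
    model_cls.objects.bulk_create(instances, batch_size=batch_size)
    return len(instances)


# Stand-ins so the sketch runs without a database; in the view you would
# call bulk_save(Data, data_serializer.validated_data) instead.
class FakeManager:
    def __init__(self):
        self.calls = []

    def bulk_create(self, objs, batch_size=None):
        # Record the call so it can be inspected afterwards
        self.calls.append((len(objs), batch_size))
        return objs


class FakeData:
    objects = FakeManager()

    def __init__(self, **fields):
        self.fields = fields


created = bulk_save(FakeData, [{"usage": 1.0}, {"usage": 2.0}], batch_size=500)
```

The view then shrinks to validating with the serializer and calling the helper once is_valid() succeeds.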
Upvotes: 2
Reputation: 118
You can try overriding the create method of the view (not the serializer) as follows, so that it accepts either a single object or a list:
def create(self, request):
    # Treat a JSON list as a multi-object payload
    is_many = isinstance(request.data, list)
    serializer = self.get_serializer(data=request.data, many=is_many)
    serializer.is_valid(raise_exception=True)
    self.perform_create(serializer)
    headers = self.get_success_headers(serializer.data)
    return Response(serializer.data, status=status.HTTP_201_CREATED, headers=headers)
Upvotes: -1