Reputation: 53873
I've used Django REST Framework to expose an API which is only used by another service to POST new data. It basically just takes JSON and inserts it into the DB. That's all.
It's quite a high-volume data source (sometimes more than 100 records/second), so I need to tune it a bit.
So I logged the (PostgreSQL) queries that are run, and I see that every POST results in 3 queries:
2019-10-01 11:09:03.320 CEST [23983] postgres@thedb LOG: statement: SET TIME ZONE 'UTC'
2019-10-01 11:09:03.322 CEST [23983] postgres@thedb LOG: statement: SELECT (1) AS "a" FROM "thetable" WHERE "thetable"."id" = 'a7f74e5c-7cad-4983-a909-49857481239b'::uuid LIMIT 1
2019-10-01 11:09:03.363 CEST [23983] postgres@thedb LOG: statement: INSERT INTO "thetable" ("id", "version", "timestamp", "sensor", [and 10 more fields...]) VALUES ('a7f74e5c-7cad-4983-a909-49857481239b'::uuid, '1', '2019-10-01T11:09:03.313690+02:00'::timestamptz, 'ABC123', [and 10 more fields...])
I tuned the DB for INSERTs to be fast, but SELECTs are slow. So I would like to remove the SELECT from the system. I added this line to the Serializer:
id = serializers.UUIDField(validators=[])
But it still does a SELECT. Does anybody know how I can prevent the SELECT from happening?
For completeness, the full Serializer now looks like this:
import logging

from rest_framework import serializers

from .models import TheData

log = logging.getLogger(__name__)


class TheDataSerializer(serializers.HyperlinkedModelSerializer):
    class Meta:
        model = TheData
        fields = [
            'id',
            'version',
            'timestamp',
            'sensor',
            [and 10 more fields...]
        ]


class TheDataDetailSerializer(serializers.ModelSerializer):
    id = serializers.UUIDField(validators=[])

    class Meta:
        model = TheData
        fields = '__all__'
And as requested by frankie567, the ViewSet:
class TheDataViewSet(DetailSerializerMixin, viewsets.ModelViewSet):
    serializer_class = serializers.TheDataSerializer
    serializer_detail_class = serializers.TheDataDetailSerializer
    queryset = TheData.objects.all().order_by('timestamp')
    http_method_names = ['post', 'list', 'get']
    filter_backends = [DjangoFilterBackend]
    filter_class = TheDataFilter
    pagination_class = TheDataPager

    def get_serializer(self, *args, **kwargs):
        """ The incoming data is in the `data` subfield. So I take it from there and put
        those items in root to store it in the DB"""
        request_body = kwargs.get("data")
        if request_body:
            new_request_body = request_body.get("data", {})
            new_request_body["details"] = request_body.get("details", None)
            request_body = new_request_body
            kwargs["data"] = request_body
        serializer_class = self.get_serializer_class()
        kwargs['context'] = self.get_serializer_context()
        return serializer_class(*args, **kwargs)
Upvotes: 3
Views: 478
Reputation: 1760
After some digging, I was able to see where this behaviour comes from. If you look at the Django REST Framework source code:
if getattr(model_field, 'unique', False):
    unique_error_message = model_field.error_messages.get('unique', None)
    if unique_error_message:
        unique_error_message = unique_error_message % {
            'model_name': model_field.model._meta.verbose_name,
            'field_label': model_field.verbose_name
        }
    validator = UniqueValidator(
        queryset=model_field.model._default_manager,
        message=unique_error_message)
    validator_kwarg.append(validator)
We see that if unique is True (which it is in your case, as I guess you defined your UUID field as the primary key), DRF automatically adds a UniqueValidator. This validator performs a SELECT query to check that the value doesn't already exist.
It is appended to the ones you define in the validators parameter of the field, which is why what you did has no effect.
So, how do we circumvent this?
First attempt
class TheDataDetailSerializer(serializers.ModelSerializer):
    # ... your code

    def get_fields(self):
        fields = super().get_fields()
        fields['id'].validators.pop()
        return fields
Basically, we remove the last validator of the id field (the automatically appended UniqueValidator) after the fields have been generated. There are surely more clever ways to do this. It seems to me though that DRF may be too opinionated on this matter.
Second attempt
class TheDataDetailSerializer(serializers.ModelSerializer):
    # ... your code

    def build_standard_field(self, field_name, model_field):
        field_class, field_kwargs = super().build_standard_field(field_name, model_field)
        if field_name == 'id':
            field_kwargs['validators'] = []
        return field_class, field_kwargs
When generating the field arguments, set an empty validators list if we are generating the id field.
Upvotes: 1