Reputation: 1497
The default behavior of the ListAPIView (code below) is to serialize all Report objects and the nested Log objects per Report object. What if I only want the latest Log object to be displayed per Report? How do I go about doing that?
# models.py
class Log(models.Model):
# ...
report = models.ForeignKey(Report)
timestamp = models.DateTimeField(default=datetime.datetime.now)
class Report(models.Model):
code = models.CharField(max_length=32, unique=True)
description = models.TextField()
# serializers.py
class LogSerializer(serializers.ModelSerializer):
class Meta:
model = Log
class ReportSerializer(serializers.ModelSerializer):
log_set = LogSerializer(many=True, read_only=True)
class Meta:
model = Report
fields = ('code', 'description', 'log_set')
# views.py
class ReportListView(generics.ListAPIView):
queryset = Report.objects.all()
serializer_class = ReportSerializer
I know I can do this by using a SerializerMethodField, but this can be a potentially expensive operation, since there will be an extra SQL query to retrieve the appropriate Log object for each Report object.
class ReportSerializer(serializers.ModelSerializer):
latest_log = serializers.SerializerMethodField()
class Meta:
model = Report
def get_latest_log(self, obj):
try:
latest_log = Log.objects.filter(report_id=obj.id).latest('timestamp')
except Log.DoesNotExist:
latest_log = None
return latest_log
If I have 1000 report objects, there will be 1000 extra queries if I want to render them all. How do I avoid those extra queries besides using pagination? Can anyone point me to the right direction? Thanks!
EDIT: Regarding the possible duplicate tag, the link alone provided by Mark did not completely clear up the picture for me. Todor's answer was more clear.
Upvotes: 4
Views: 2301
Reputation: 16050
You need to somehow annotate the latest_log
in the ReportQuerySet
, so it can be used by the serializer without making any extra queries.
The simplest way to achieve this is by prefetching
all the logs
per report
. The drawback of this approach is that you load in memory all the logs
per report
per page. Which is not so bad if one report
got something like 5-10-15 logs
. This will mean that for a page with 50 reports
, you are gonna load 50*10=500 logs
which is not a big deal. If there are more logs
per report
(lets say 100) then you need to make additional filtering of the queryset
.
Here is some example code:
Prefetch the logs
.
# views.py
class ReportListView(generics.ListAPIView):
queryset = Report.objects.all()\
.prefetch_related(Prefetch('log_set',
queryset=Log.objects.all().order_by('-timestamp'),
to_attr='latest_logs'
))
serializer_class = ReportSerializer
Create a helper method for easy access the latest_log
class Report(models.Model):
#...
@property
def latest_log(self):
if hasattr(self, 'latest_logs') and len(self.latest_logs) > 0:
return self.latest_logs[0]
#you can eventually implement some fallback logic here
#to get the latest log with a query if there is no cached latest_logs
return None
Finally the serializer just use the property
class ReportSerializer(serializers.ModelSerializer):
latest_log = serializers.LogSerializer()
class Meta:
model = Report
An example of a more advanced filtering of the logs
can be something like this:
Report.objects.all().prefetch_related(Prefetch('log_set', queryset=Log.objects.all().extra(where=[
"`myapp_log`.`timestamp` = (\
SELECT max(timestamp) \
FROM `myapp_log` l2 \
WHERE l2.report == `myapp_log`.`report`\
)"]
), to_attr='latest_logs'
))
Upvotes: 5
Reputation: 2816
You can use select related argument. It will hit the database only once using JOIN.
class ReportListView(generics.ListAPIView):
queryset = Report.objects.select_related('log');
serializer_class = ReportSerializer
Upvotes: -1