Divick

Reputation: 1273

Memory leak with Django + Django Rest Framework + mod_wsgi

I have the following code, where a function-based view uses a ModelSerializer to serialize data. I am running it under Apache + mod_wsgi (with 1 worker process, 1 child process and 1 mod_wsgi thread for the sake of simplicity).

With this, my memory usage shoots up significantly (200 MB - 1 GB, depending on how large the query is), stays there, and does not come down even after the request completes. On subsequent requests to the same view/URL, the memory increases slightly every time but does not take another significant jump. To rule out issues with django-filter, I have modified my view and written the filtering query myself.

The usual suspect, DEBUG=True, is ruled out, as I am not running in DEBUG mode. I have also tried using guppy to see what is happening, but I was unable to get far with it. Could someone please explain why the memory usage does not come down after the request is completed, and how to go about debugging it?
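(For anyone debugging the same symptom: an alternative to guppy is the standard-library tracemalloc module, available since Python 3.4. A minimal sketch, not from the original post, that compares snapshots taken around the suspect code to see which lines retain memory:)

```python
# Hedged sketch (not from the original post): using the standard-library
# tracemalloc module to find which source lines accumulated memory.
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

# ... run the suspect code here, e.g. serialize the queryset ...
data = [str(i) * 100 for i in range(10000)]  # stand-in allocation

after = tracemalloc.take_snapshot()
# Print the ten biggest growers, with file and line number.
for stat in after.compare_to(before, 'lineno')[:10]:
    print(stat)
```

In a WSGI setup this could be wired into middleware or the view itself, so snapshots bracket a single request.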

Update: I am using the default CACHES setting, i.e. I have not defined it at all, in which case I presume it will use local-memory caching, as mentioned in the docs.

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
    }
}



class MeterData(models.Model):
    meter = models.ForeignKey(Meter)
    datetime = models.DateTimeField()

    # Active Power Total
    w_total = models.DecimalField(max_digits=13, decimal_places=2,
                                  null=True)
    ...


class MeterDataSerializer(serializers.ModelSerializer):
    class Meta:
        model = MeterData
        exclude = ('meter', )


@api_view(['GET', ])
@permission_classes((AllowAny,))
def test(request):
    startDate = request.GET.get('startDate', None)
    endDate = request.GET.get('endDate', None)
    meter_pk = request.GET.get('meter', None)
    # Writing the query ourselves instead of using django-filter
    # to keep things simple.
    queryset = MeterData.objects.filter(meter__pk=meter_pk,
                                        datetime__gte=startDate,
                                        datetime__lte=endDate)


    logger.info(queryset.query)
    kwargs = {}
    kwargs['context'] = {
        'request': request,
        'view': test,
        'format': 'format',
    }
    kwargs['many'] = True

    serializer = MeterDataSerializer(queryset, **kwargs)
    return Response(serializer.data)
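(An aside for readers with the same problem: serializing the whole queryset at once materializes every row, and DRF then builds the full response in memory. A common mitigation, not used in the post above, is to consume results in fixed-size chunks, e.g. Django's `queryset.iterator()` or `Paginator`. The idea in plain Python, with `chunked()` as a hypothetical stand-in for those APIs:)

```python
# Hedged sketch: bounding peak memory by consuming results in fixed-size
# chunks instead of materializing the whole result set at once.
# chunked() is a hypothetical helper standing in for queryset.iterator().
from itertools import islice

def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

# Only `size` rows are resident at a time, not the full result set.
total = 0
for chunk in chunked(range(10_000), 500):
    total += len(chunk)  # stand-in for "serialize this chunk"
print(total)  # 10000
```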

Upvotes: 3

Views: 2044

Answers (1)

Sayse

Reputation: 43300

Whilst I can't say for certain, I'll add this as an answer anyway and let it be judged on its merits...

As you know, django's default cache is the LocMemCache

In those docs you'll find the following note:

Note that each process will have its own private cache instance

And I think this is all you're seeing. The jump in memory is just the storage of your query results. I'd only be concerned if the memory usage continued to grow beyond normal levels.

The same doc also notes that LocMemCache is probably not viable in production, so it might be time to move to another backend, which would also let you confirm whether caching is the culprit.
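For reference, a production-grade backend along the lines the docs suggest could look like this (a sketch; the server address is a placeholder, and Memcached would need to be installed and running):

```python
# Hedged sketch: swapping LocMemCache for Memcached, as the Django docs
# suggest for production. The LOCATION below is a placeholder assumption.
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
    }
}
```

Unlike LocMemCache, this keeps the cache out of each worker process's own heap, so per-process memory no longer grows with cached data.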

Upvotes: 3
