mohammad
mohammad

Reputation: 2198

approximate count of rows in postgresql django

I have a postgresql database connected to with django. In database there is lots of big tables which I want their row count. Because of large size of tables this takes a lot of time to execute.

I founded that the approximate count could be retrieved from pg_class. Is there any way to this in Django and not executing raw query?

Upvotes: 3

Views: 2180

Answers (1)

willeM_ Van Onsem
willeM_ Van Onsem

Reputation: 477318

I propose to use a dedicated package for this named django-postgres-fuzzycount [GitHub]. This package provides a manager that will do the fast counting.

You can install the package with pip (for example in your local environment):

$ pip install django-postgres-fuzzycount

Then you can add the FuzzyCountManager to the models where you want to obtain an approximative count:

from django.db import models
from fuzzycount import FuzzyCountManager

class SomeModel(models.Model):

    #  ... (some fields) ...

    objects = models.Manager()
    approx = FuzzyCountManager()

and then you can count approximatively with:

SomeModel.approx.count()

In case you .filter(..), Django will calculate the real number, since the pg_class table only stores an apprixmative number of rows for the entire table, so:

SomeModel.approx.filter(foo=bar).count()

will take more time (depending on indexes, etc.).

You can also "patch" the objects manager directly, but then obtaining the real number of records might be harder:

from django.db import models
from fuzzycount import FuzzyCountManager

class SomeModel(models.Model):

    #  ... (some fields) ...

    objects = FuzzyCountManager()

It is also nice that if you change the backend database to another database, the FuzzyCountManager(..) will act as a normal Manager, so in case you later change the database system, you do not have to rewrite the managers.

Upvotes: 3

Related Questions