Alejandroid
Alejandroid

Reputation: 199

Slow query using Django and PostgreSQL with > 40 million rows

I'm working with a model (table) with more than 40 million rows and queries are very slow (> 2 min), using PostgreSQL.

The models:

class ModelA(...):
    source = models.ForeignKey(ModelB, related_name='source', ...)
    target = models.ForeignKey(ModelB, related_name='target', ...)

class ModelB(...):
    c = models.ForeignKey(ModelC, ...)
    ...

The Django filter:

ModelA.objects.filter(source__isnull=True, target__c=my_c)

The query:

SELECT "model_a"."id", ... 
FROM "model_a" 
INNER JOIN "model_b" T3 ON ("model_a"."target_id" = T3."id") 
WHERE ("model_a"."removed" = False AND "model_a"."source_id" IS NULL AND T3."c_id" = 389)

Explain query:

                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1001.26..2498957.37 rows=12277 width=71)
   Workers Planned: 3
   ->  Nested Loop  (cost=1.26..2496729.67 rows=3960 width=71)
         ->  Parallel Index Scan using model_b_model_c_removed_filename on model_b t3  (cost=0.69..2481771.88 rows=5907 width=4)
               Index Cond: (c_id = 389)
         ->  Index Scan using model_a_source_target_idx on model_a  (cost=0.56..2.52 rows=1 width=71)
               Index Cond: (target_id = t3.id)
               Filter: (NOT removed)
(8 rows)

How can I optimize it?

I am having optimization problems and I would like to know different solutions, etc.

Upvotes: 0

Views: 1657

Answers (1)

Muhammad Faizan Fareed
Muhammad Faizan Fareed

Reputation: 3758

  1. In Django you can use select_related and prefetch_related to optimized your queries.
  2. You can add indexing into your models.
  3. Use Redis or Memcached for caching layer to optimized read access.

What's the difference between select_related and prefetch_related in Django ORM? Stack over flow

select_related Django official doc

prefetch_related Django official doc

Solving Performance Problems in the Django ORM

Django Database Optimization Tips

Upvotes: 1

Related Questions