dVeza
dVeza

Reputation: 552

Django ORM join many to many relation in one query

If we have 2 models A, B with a many to many relation.

I want to obtain a sql query similar to this:

SELECT *
FROM a LEFT JOIN ab_relation 
ON ab_relation.a_id = a.id
JOIN b ON ab_relation.b_id = b.id;

So in django when I try:

A.objects.prefetch_related('bees')

I get 2 queries similar to:

SELECT * FROM a;
SELECT ab_relation.a_id AS prefetch_related_val_a_id, b.* 
FROM b JOIN ab_relation ON b.id = ab_relation.b_id
WHERE ab_relation.a_id IN (123, 456... list of all a.id);

Given that A and B have moderately big tables, I find the way django does it too slow for my needs.

The question is: Is it possible to obtain the left join manually written query through the ORM?

Edits to answer some clarifications:

Upvotes: 5

Views: 7311

Answers (1)

dani herrera
dani herrera

Reputation: 51715

For the direct answer skip to point 6)

Let'ts talk step by step.

1) N:M select. You say you want a query like this:

SELECT *
FROM a JOIN ab_relation ON ab_relation.a_id = a.id
JOIN b ON ab_relation.b_id = b.id;

But this is not a real N:M query, because you are getting only A-B related objects The query should use outer joins. At least like:

SELECT *
FROM a left outer JOIN 
     ab_relation ON ab_relation.a_id = a.id left outer JOIN 
     b ON ab_relation.b_id = b.id;

In other cases you are getting only A models with a related B.

2) Read big tables You say "moderately big tables". Then, are you sure you want to read the whole table from database? This is not usual on a web environment to read a lot of data, and, in this case, you can paginate data. May be is not a web app? Why you need to read this big tables? We need context to answer your question. Are you sure you need all fields from both tables?

3) Select * from Are you sure you need all fields from both tables? May be if you read only some values this query will run faster.

A.objects.values( "some_a_field", "anoter_a_field", "Bs__some_b_field" )

4) As summary. ORM is a powerful tool, two single read operations are "fast". I write some ideas but perhaps we need more context to answer your question. What means moderate big tables, wheat means slow, what are you doing with this data, how many fields or bytes has each row from each table, ... .

Editedd Because OP has edited the question.

5) Use right UI controls. You say:

The queryset will be used as options for a multi-select form-field, where we will need to represent A objects that have a relation with B with an extra string from the B.field.

It looks like an anti-pattern to send to client 4k rows for a form. I suggest to you to move to a live control that loads only needed data. For example, filtering by some text. Take a look to django-select2 awesome project.

6) You say

The question is: Is it possible to obtain the left join manually written query through the ORM?

The answer is: Yes, you can do it using values, as I said it on point 3. Sample: Material and ResultatAprenentatge is a N:M relation:

>>> print( Material
          .objects
          .values( "titol", "resultats_aprenentatge__codi" )
          .query )

The query:

SELECT "material_material"."titol", 
       "ufs_resultataprenentatge"."codi" 
FROM   "material_material" 
       LEFT OUTER JOIN "material_material_resultats_aprenentatge" 
                    ON ( "material_material"."id" = 
"material_material_resultats_aprenentatge"."material_id" ) 
LEFT OUTER JOIN "ufs_resultataprenentatge" 
ON ( 
"material_material_resultats_aprenentatge"."resultataprenentatge_id" = 
"ufs_resultataprenentatge"."id" ) 
ORDER  BY "material_material"."data_edicio" DESC 

Upvotes: 7

Related Questions