Reputation: 552
If we have 2 models A, B with a many to many relation.
I want to obtain a sql query similar to this:
SELECT *
FROM a LEFT JOIN ab_relation
ON ab_relation.a_id = a.id
JOIN b ON ab_relation.b_id = b.id;
So in django when I try:
A.objects.prefetch_related('bees')
I get 2 queries similar to:
SELECT * FROM a;
SELECT ab_relation.a_id AS prefetch_related_val_a_id, b.*
FROM b JOIN ab_relation ON b.id = ab_relation.b_id
WHERE ab_relation.a_id IN (123, 456... list of all a.id);
Given that A and B have moderately big tables, I find the way django does it too slow for my needs.
The question is: Is it possible to obtain the left join manually written query through the ORM?
Edits to answer some clarifications:
Yes a LEFT OUTER JOIN
would be preferable to get all A's in the queryset, not only those with a relation with B (updated sql).
Moderately big means ~4k rows each, and too slow means ~3 seconds (on first load, before redis cache.) Keep in mind there are other queries on the page.
Actually yes, we need only B.one_field but having tried with Prefetch('bees', queryset=B.objects.values('one_field'))
an error said you can't use values
in a prefetch.
The queryset will be used as options for a multi-select form-field, where we will need to represent A objects that have a relation with B with an extra string from the B.field.
Upvotes: 5
Views: 7311
Reputation: 51715
For the direct answer skip to point 6)
Let'ts talk step by step.
1) N:M select. You say you want a query like this:
SELECT *
FROM a JOIN ab_relation ON ab_relation.a_id = a.id
JOIN b ON ab_relation.b_id = b.id;
But this is not a real N:M query, because you are getting only A-B related objects The query should use outer joins
. At least like:
SELECT *
FROM a left outer JOIN
ab_relation ON ab_relation.a_id = a.id left outer JOIN
b ON ab_relation.b_id = b.id;
In other cases you are getting only A
models with a related B
.
2) Read big tables You say "moderately big tables". Then, are you sure you want to read the whole table from database? This is not usual on a web environment to read a lot of data, and, in this case, you can paginate data. May be is not a web app? Why you need to read this big tables? We need context to answer your question. Are you sure you need all fields from both tables?
3) Select * from Are you sure you need all fields from both tables? May be if you read only some values this query will run faster.
A.objects.values( "some_a_field", "anoter_a_field", "Bs__some_b_field" )
4) As summary. ORM is a powerful tool, two single read operations are "fast". I write some ideas but perhaps we need more context to answer your question. What means moderate big tables, wheat means slow, what are you doing with this data, how many fields or bytes has each row from each table, ... .
Editedd Because OP has edited the question.
5) Use right UI controls. You say:
The queryset will be used as options for a multi-select form-field, where we will need to represent A objects that have a relation with B with an extra string from the B.field.
It looks like an anti-pattern to send to client 4k rows for a form. I suggest to you to move to a live control that loads only needed data. For example, filtering by some text. Take a look to django-select2 awesome project.
6) You say
The question is: Is it possible to obtain the left join manually written query through the ORM?
The answer is: Yes, you can do it using values
, as I said it on point 3. Sample: Material
and ResultatAprenentatge
is a N:M relation:
>>> print( Material
.objects
.values( "titol", "resultats_aprenentatge__codi" )
.query )
The query:
SELECT "material_material"."titol",
"ufs_resultataprenentatge"."codi"
FROM "material_material"
LEFT OUTER JOIN "material_material_resultats_aprenentatge"
ON ( "material_material"."id" =
"material_material_resultats_aprenentatge"."material_id" )
LEFT OUTER JOIN "ufs_resultataprenentatge"
ON (
"material_material_resultats_aprenentatge"."resultataprenentatge_id" =
"ufs_resultataprenentatge"."id" )
ORDER BY "material_material"."data_edicio" DESC
Upvotes: 7