Reputation: 8788
Using django with a MySQL DB and given these models:
ModelB ---FK---> ModelA
- ref_type
- ref_id
ModelC
I want to get all the ModelC for each ModelA via an annotation.
I tried many options looking at existing solutions but could not make it work.
The following code works when there is just one ModelC for each ModelA but as soon as there is more than one, I get the Subquery returns more than 1 row
error and I don't know how to get a list of the ModelC models instead. Ideally, I'd like to build a list of JSON objects of the ModelC.
qs = ModelA.objects.all()
c_ids = (
ModelB.objects \
.filter(modela_id=OuterRef(OuterRef('id')), ref_type='c') \
.values('ref_id')
)
all_c = (
ModelC.objects \
.filter(id__in=Subquery(c_ids)) \
.values('id')
)
qs1 = qs.annotate(all_c=Subquery(all_c ))
for p in qs1:
print(p, p.all_c)
Upvotes: 3
Views: 1418
Reputation: 5116
The following should do
from django.db.models import JSONField
from django.db.models.aggregates import Aggregate
class JSONArrayAgg(Aggregate):
function = "JSON_ARRAYAGG"
output_field = JSONField()
ModelA.objects.annotate(
all_c=Subquery(
ModelB.objects.filter(
ref_type="c",
modela_id=OuterRef("id"),
).values(
"modela_id"
).values_list(
JSONArrayAgg("ref_id")
)
)
)
which translates to
SELECT
model_a.*,
(SELECT JSON_ARRAYAGG(model_b.ref_id)
FROM model_b
WHERE model_b.ref_type = "c" AND model_b.modela_id = model_a.id
GROUP BY model_b.modela_id
) all_c
FROM model_a
But it would be much easier if you provided your exact model definition as it's likely only a matter of doing something along the lines of (JSONArrayAgg.filter
cannot be used due to a MySQL bug.
ModelA.objects.filter(
modelb_set__ref_type="c",
).annotate(
all_c=JSONArrayAgg("modelb_set__ref_id")
)
which translate to
SELECT
model_a.*,
JSON_ARRAYAGG(model_b.ref_id)
FROM model_a
INNER JOIN model_b ON (model_b.modela_id = model_a.id)
WHERE model_b.ref_type = "c"
GROUP BY model_a.id
You could also use FilteredRelation
if you want the condition to be pushed to the JOIN
instead.
ModelA.objects.annotate(
all_c_rel=FilteredRelation(
"modelb_set", Q(modelb_set__ref_type="c")
),
all_c=JSONArrayAgg("all_c_rel__ref_id")
)
Which results in
SELECT
model_a.*,
JSON_ARRAYAGG(model_b.ref_id)
FROM model_a
LEFT OUTER JOIN model_b ON (
model_b.modela_id = model_a.id
AND model_b.ref_type = "c"
)
GROUP BY model_a.id
But the LEFT OUTER JOIN
might re-surface the issue you have with MySQL's handling of NULL
in JSON_ARRAYAGG
.
Upvotes: 2
Reputation: 1039
I come from the assumption that Model B is indeed a through table for M2M relationship between Model A and Model C as Işık Kaplan suggested.
In Postgres you could use ArrayAgg like Işık Kaplan suggested. Equivalent in MySQL in GROUP_CONCAT but it is not present in the ORM out of the box. Also from personal experience I wouldn't recommend it as it performed terribly in my use case.
What I ended up doing was combining 2 queries using Python which was way faster then 1 complicated query with GROUP_CONCAT (around 60K records of "Model A" and 20K of "Model B" in my case). In your case it would look like this:
a_qs = ModelA.objects.all()
c_ids_dict = defaultdict(list)
c_ids = a_qs.values("id", "models_c_objects__id")
for item in c_ids:
if item["models_c_objects__id"]:
c_ids_dict[item["id"]].append(item["models_c_objects__id"])
for p in a_qs:
print(p, c_ids_dict.get(p.id, []))
Upvotes: 0
Reputation: 3002
ModelB looks like a junction table. Having an id pointing to A and C
Django supports junction tables.
But when it comes to annotation the objects with the list of ids, I'm not entire sure if that is possible purely by the ORM.
class ModelA(models.Model):
model_c_objects = models.ManyToManyField("ModelC", through="ModelB")
class ModelB(models.Model):
model_a = models.ForeignKey(ModelA, on_delete=models.CASCADE)
model_b = models.ForeignKey(ModelB, on_delete=models.CASCADE)
class ModelC(models.Model):
...
# This one here I have no idea if it would work or not
ModalA.objects.prefetch_related("models_c_objects").annotate(model_c_object_ids=ArrayAgg("model_c_objects__id")
# If it doesn't:
class ModelA(models.Model):
model_c_objects = models.ManyToManyField("ModelC", through="ModelB")
@property
def model_c_object_ids(self):
return list(self.model_c_objects.values("id", flat=True))
# And you can then use it like you wished
for model_a_object in ModelA.objects.prefetch_related("models_c_objects"):
model_a_object.model_c_object_ids # list of model_c ids like: [1,4,12,63]
I'm feeling a bit lazy but either of the two solutions should work and they both use a single query.
Upvotes: 0