qwertyui90
qwertyui90

Reputation: 201

Querying ManyToMany in Django giving me duplicate results

I have two models

class Tag(models.Model):
   key = models.CharField(max_length=200)

class Post(models.Model):
   name = models.CharField(max_length=200)
   tags = models.ManyToManyField(Tag)

I m trying to filter out posts with a list of tags. Lets say tags heat and warm. I will get a list of tags in my api function(['heat', 'warm']). I want to filter all Post data which have tags whose keys are in the list. I tried many types and didn't get correct output. Is there a way to do this on single query?

Upvotes: 7

Views: 2393

Answers (3)

Paweł
Paweł

Reputation: 120

Expanding Leo's answer, you don't actually need to create a new "through" model. Django already does it for you (it creates a table that connects 2 models by primary keys). You can retrieve the "through" model like so:

PostTag = Post.tags.through

Then in order to filter by certain tags efficiently and without the use of distinct you can use an Exists subquery combined with an OuterRef expression. See Filtering on a Subquery() or Exists() expressions for more details. In this case it should be something like this:

from django.db.models import Exists, OuterRef

desired_tags = ['heat', 'warm']
PostTag = Post.tags.through
post_tags = PostTag.objects.filter(post=OuterRef('pk'), tag__key__in=desired_tags)
posts_with_tags = Post.objects.filter(Exists(post_tags))

Upvotes: 1

Leo Antunes
Leo Antunes

Reputation: 739

While Wilfried's answer is correct and perfectly fine for a lot of use cases, it's worth noting that on the SQL level, DISTINCT may have a performance impact, especially if you expect you query to match a significant portion of your data (depending on DB and table sizes; see e.g. here and here for more details).

Another slightly more verbose option, which avoids this pitfall, is using a through model and a Subquery (introduced in django 1.11). Based on OPs code:

class Tag(models.Model):
    key = models.CharField(max_length=200)

class Post(models.Model):
    name = models.CharField(max_length=200)
    tags = models.ManyToManyField(Tag, through='Tagging')

class Tagging(models.Model):
    tag = models.ForeignKey(Tag, on_delete=models.CASCADE)
    post = models.ForeignKey(Post, on_delete=models.CASCADE)

and the respective query:

tags = Tagging.objects.filter(tag__key__in=['heat', 'warm'])
Post.objects.filter(pk__in=models.Subquery(tags.values('post__pk')))

Using .explain(analyze=True) (introduced in django 2.1) will help you make an informed decision.

Upvotes: 8

Wilfried
Wilfried

Reputation: 1633

All Post with this tag key equal to heat or warm

Post.objects.filter(tags__key_in=['heat', 'warm']) 

Add distinct to avoid duplicate :

Post.objects.filter(tags__key_in=['heat', 'warm']).distinct()

Upvotes: 7

Related Questions