Sri Raghavan
Sri Raghavan

Reputation: 611

Django ORM QuerySet intersection by a field

These are the (pseudo) models I've got.

Blog:
  name
  etc...

Article:
  name
  blog
  creator
  etc

User (as per django.contrib.auth)

So my problem is this: I've got two users. I want to get all of the articles that the two users published on the same blog (no matter which blog). I can't simply filter the Article model by both users, because that would yield the set of Articles created by both users. Obviously not what I want. but can I filter somehow to get all of the articles where a field of the object matches between the two querysets?

Upvotes: 2

Views: 2910

Answers (2)

Jarret Hardie
Jarret Hardie

Reputation: 98042

This is a great question for exercising Django's ORM :-)

Depending on which parameters are known in advance, there are a couple ways to approach this.

Scenario 1: You know the users and the specific blog

If you have one particular blog in mind and want to simple find all articles written by either author, you can use a Q object. I don't think this is your scenario, but I'll put it here just in case:

from django.db.models import Q
Article.objects.filter(Q(creator=user1) | Q(creator=user2), blog=some_blog_instance)

Scenario 2: You know only the users

If you want to find all blogs where both users have published, and want to then find which articles they've published in those blogs, you'd want to start from the Blog model:

from django.db.models import Q

# Find all blogs where both user1 and user2 have written articles
blogs = Blog.objects.filter(article__creator=user1).\
        filter(article__creator=user2).distinct()

# Now find which articles those were
for blog in blogs:
    articles = blog.article_set.filter(Q=(creator=user1) | Q=(creator=user2))

Edit based on Paulo's comments:

Here is a test set of models that I believe matches the OP's pseudo code and which demonstrates the above code working (at least on sqlite3 and postgres):

from django.db import models
from django.contrib.auth.models import User

class Blog(models.Model):
    name = models.CharField(max_length=128)

class Article(models.Model):
    name = models.CharField(max_length=128)
    blog = models.ForeignKey(Blog)
    creator = models.ForeignKey(User)

Then some data where both user1 and user2 write articles on blog2:

u1 = User.objects.create(username='u1')
u2 = User.objects.create(username='u2')

b1 = Blog.objects.create(name='b1')
b2 = Blog.objects.create(name='b2')
b3 = Blog.objects.create(name='b3')

b1_art1 = Article.objects.create(name='b1_art1', blog=b1, creator=u1)
b2_art1 = Article.objects.create(name='b2_art1', blog=b2, creator=u1)
b2_art2 = Article.objects.create(name='b2_art2', blog=b2, creator=u2)

The query:

[b.name for b in Blog.objects.filter(article__creator=user1).\
    filter(article__creator=user2).distinct()]

produces:

[b2]

And the SQL is (my test django app was named foo):

SELECT "foo_blog"."id", "foo_blog"."name" 
FROM "foo_blog" 
INNER JOIN "foo_article" ON ("foo_blog"."id" = "foo_article"."blog_id") 
INNER JOIN "foo_article" T4 ON ("foo_blog"."id" = T4."blog_id") 
WHERE ("foo_article"."creator_id" = 1  AND T4."creator_id" = 2 )

So, while the WHERE clause does have (A AND B), A and B reference different inner joins ("foo_article"."creator_id" vs. T4."creator_id"), and not the same table (which is what filter(A, B) would generate, (ie: WHERE ("foo_article"."creator_id" = 1 and "foo_article"."creator_id" = 2))

(Without the distinct() clause, if you added more articles for either author, you'd get multiple b2 entries in the queryset results.)

Like I said, it's a great ORM exercise!

Upvotes: 3

Paulo Scardine
Paulo Scardine

Reputation: 77359

I remember to read something like "Django ORM you get you there 85% of the time, the other 15% is raw SQL" in a presentation by someone from the core team - but I can't find the source anymore.

Your problem seems to fit that 15% of raw SQL.

Upvotes: 1

Related Questions