MatthewKremer
MatthewKremer

Reputation: 1569

Django ~Q with Joins Functioning Incorrectly. Bug?

We're encountering a very strange problem regarding the negation of Q objects in Django. Let's just use Football as the example:

class Team(models.Model):
    id = UUIDField(primary_key=True)

class Player(models.Model):
    id = UUIDField(primary_key=True)
    name = models.CharField(max_length=128)
    team = models.ForeignKey(Team)
    touchdowns = models.IntegerField()

There are 10 teams.

There are 100 players, with 10 on each team. Each team has a player named "Joe". There is one "Joe" on one team who has scored 5 touchdowns. All other Joe's have scored 1 touchdown. There are 8 teams where every Player has scored only 1 touchdown.

I want to get the Teams that have a player named Joe that has scored at least 3 Touchdowns.

models.Team.objects.filter(Q(player__name="Joe", player__touchdowns__gte=3)).count()

That returns One, as it should.The negation of that should return 9 (The other 9 teams that don't have a player named Joe that has at least 3 Touchdowns):

models.Team.objects.filter(~Q(player__name="Joe", player__touchdowns__gte=3)).count()

instead returns any team where everyone on that team has less than 3 Touchdowns (8).

Where am I going wrong? Please note that our actual application of this is much more complicated, so we NEED to use Q objects with negation, we cannot use Exclude.

Upvotes: 3

Views: 538

Answers (3)

Laurent Lyaudet
Laurent Lyaudet

Reputation: 958

I had a similar case and found that, using boolean logic to put the negations on the leaves of the tree solved the issue. I made a Django snippet here : https://djangosnippets.org/snippets/10866/ Here is a copy of my code :

def put_Q_negations_to_leaves(
    query_filter: Q,
    negate: bool = False,
    first_call: bool = True,
    debug: bool = False,
):
    negate_below = (negate != query_filter.negated)  # XOR
    if debug:
        logger.info(
            f"put_Q_negations_to_leaves() query_filter:{query_filter}"
            f" negate:{negate} negate_below:{negate_below}"
        )
    true_kwargs = {
        "_connector": query_filter.connector,
        "_negated": False,
    }
    new_children = []
    for child in query_filter.children:
        if debug:
            logger.info(child.__repr__())
        if not isinstance(child, Q):
            if negate_below:
                new_child = ~Q(child)
            else:
                new_child = child
        else:
            new_child = put_Q_negations_to_leaves(child, negate=negate_below, first_call=False)
        if debug:
            logger.info(new_child.__repr__())
        new_children.append(new_child)
    if len(new_children) == 1:
        # One child
        if isinstance(new_children[0], Q) or first_call == False:
            # Double negation canceled out if possible
            return new_children[0]
        else:
            true_kwargs["_negated"] = negate_below
    if negate_below:
        if true_kwargs["_connector"] == Q.AND:
            true_kwargs["_connector"] = Q.OR
        else:
            true_kwargs["_connector"] = Q.AND
    return Q(*new_children, **true_kwargs)

To make this snippet works in all cases, it is necessary to change the following lines :

if negate_below:
    new_child = ~Q(child)

You must handle all negation of field lookups : https://docs.djangoproject.com/en/4.0/ref/models/querysets/#field-lookups-1 with string manipulation on the first element of the tuple.

For that, you can look at this answer on StackOverflow : How do I do a not equal in Django queryset filtering? https://stackoverflow.com/a/29227603/5796086

However, for most uses, it will be simpler to use a SubQuery (or Exists).

Use example :

from django.db.models import Q, F

# For simplicity, and avoiding mixing args and kwargs, we only use args since :
# ("some_fk__some_other_fk__some_field", 111) arg
# is equivalent to
# some_fk__some_other_fk__some_field=111 kwarg

unmodified_filter = ~Q(
  ("some_fk__some_other_fk__some_field", 111),
  Q(("some_fk__some_other_fk__some_other_field__lt", 11))
  | ~Q(("some_fk__some_other_fk__some_yet_another_field", F("some_fk__some_yet_another_field")))
)

modified_filter = put_Q_negations_to_leaves(unmodified_filter)
print(unmodified_filter)
print(modified_filter)

This will output something that you can beautify like this:

Before:
(NOT
   (AND:
      ('some_fk__some_other_fk__some_field', 111),
      (OR:
         ('some_fk__some_other_fk__some_other_field__lt', 11),
         (NOT
            (AND: ('some_fk__some_other_fk__some_yet_another_field', F(some_fk__some_yet_another_field)))
         )
      )
   )
)

After:
(OR:
   (NOT
      (AND: ('some_fk__some_other_fk__some_field', 111))
   ),
   (AND:
      (NOT
         (AND: ('some_fk__some_other_fk__some_other_field__lt', 11))
      ), <-- This is where negation of lookups like "lt" -> "gte" should be handled
      ('some_fk__some_other_fk__some_yet_another_field', F(some_fk__some_yet_another_field))  <-- Double negation canceled out
   )
)

Upvotes: 1

Denis SkS
Denis SkS

Reputation: 856

Сheck queries in database with Q and ~Q:

>>> print models.Team.objects.filter(Q(player__name="Joe", player__touchdowns__gte=3)).query
>>> print models.Team.objects.filter(~Q(player__name="Joe", player__touchdowns__gte=3)).query

And test it with two Q objects:

>>> print models.Team.objects.filter(Q(player__name="Joe") & Q(player__touchdowns__gte=3)).query
>>> print models.Team.objects.filter(~Q(player__name="Joe") & ~Q(player__touchdowns__gte=3)).query

Upvotes: 0

Chris Pratt
Chris Pratt

Reputation: 239380

The best way to ferret out why these differences occur is to investigate the queries that are generated: django-debug-toolbar comes with a debugsqlshell command that prints the actual query sent to the database after any use of the Django queryset API. For these tests, I used the User model with a join on Group. I too noticed different counts for selected objects, so on the face it seems like a good correlation to your use-case.

User.objects.filter(~Q(username='jdoe', groups__name='Awesome Group'))

SELECT "auth_user"."id",
       "auth_user"."username",
       "auth_user"."first_name",
       "auth_user"."last_name",
       "auth_user"."email",
       "auth_user"."password",
       "auth_user"."is_staff",
       "auth_user"."is_active",
       "auth_user"."is_superuser",
       "auth_user"."last_login",
       "auth_user"."date_joined"
FROM "auth_user"
WHERE NOT ("auth_user"."username" = 'jdoe'
           AND "auth_user"."id" IN
             (SELECT U1."user_id"
              FROM "auth_user_groups" U1
              INNER JOIN "auth_group" U2 ON (U1."group_id" = U2."id")
              WHERE (U2."name" = 'Awesome Group'
                     AND U1."user_id" IS NOT NULL))) LIMIT 21

User.objects.exclude(Q(username='jdoe', groups__name='Awesome Group'))

SELECT "auth_user"."id",
       "auth_user"."username",
       "auth_user"."first_name",
       "auth_user"."last_name",
       "auth_user"."email",
       "auth_user"."password",
       "auth_user"."is_staff",
       "auth_user"."is_active",
       "auth_user"."is_superuser",
       "auth_user"."last_login",
       "auth_user"."date_joined"
FROM "auth_user"
INNER JOIN "auth_user_groups" ON ("auth_user"."id" = "auth_user_groups"."user_id")
INNER JOIN "auth_group" ON ("auth_user_groups"."group_id" = "auth_group"."id")
WHERE NOT (("auth_user"."username" = 'jdoe'
            AND "auth_group"."name" = 'Awesome Group')) LIMIT 21

The difference here comes in where the INNER JOIN happens. The Q object causes the INNER JOIN in the first example and then the selection with the INNER JOIN is negated because of the ~. The case of exclude, the negation happens in parallel to the INNER JOIN.

Upvotes: 3

Related Questions