Saqib Ali

Reputation: 12545

Can a Django Queryset be created to do this complex query?

Assume that I have the following Django class:

from datetime import datetime

from django.db import models


class MyModel(models.Model):
    a = models.IntegerField()
    created_ts = models.DateTimeField(default=datetime.utcnow, editable=False)

    def __str__(self):
        return "<%s %s>" % (
            self.__class__.__name__,
            "; ".join(
                [
                    "ID: %s" % self.pk,
                    "a: %s" % self.a,
                    "created_ts: %s" % self.created_ts,
                ]
            )
        )

I would like to find the instances of MyModel with the latest created_ts for each distinct value of a. Can I do so with a single QuerySet? If so, what is that QuerySet? If not, what is the most efficient way to get that result?

In the end, I want Integer/MyModel-instance pairs. The result should look roughly like this:

{
    1: <MyModel ID: 1; a: 1; created_ts: 2004-11-08 06:01:00>,
    5: <MyModel ID: 2; a: 5; created_ts: 2004-11-05 08:01:32>,
    3: <MyModel ID: 3; a: 3; created_ts: 2004-11-04 11:01:42>,
    0: <MyModel ID: 4; a: 0; created_ts: 2004-11-03 06:12:10>,
}

Upvotes: 2

Views: 659

Answers (5)

Régis B.

Reputation: 10588

Your question is an almost exact duplicate of your previous question, "How to make Django Queryset that selects records with max value within a group".

I will thus answer using part of my former proposed solution:

MyModel.objects.filter(
    created_ts__in=MyModel.objects.values(
        "a"
    ).annotate(
        created_ts=models.Max(
            "created_ts"
        )
    ).values_list("created_ts", flat=True)
)

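Note that this requires only a single SQL query, as you can see by comparing len(django.db.connection.queries) before and after evaluating the queryset (queries are only recorded when DEBUG is True).

However, note that this solution only works if your created_ts values are guaranteed to be unique, which might not be the case for you. If two rows with different values of a share a timestamp, a row that is not the latest in its own group can still match.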
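To illustrate the uniqueness caveat, here is a minimal plain-Python sketch of what the filter does. The rows are hypothetical (pk, a, created_ts) tuples, not real model instances, and the variable names are my own.

```python
from datetime import datetime

# Hypothetical rows as (pk, a, created_ts).
# pk 3 is NOT the latest row for a=5, but shares a timestamp with the latest row for a=1.
rows = [
    (1, 1, datetime(2004, 11, 8, 6, 1, 0)),
    (2, 5, datetime(2004, 11, 9, 8, 1, 32)),
    (3, 5, datetime(2004, 11, 8, 6, 1, 0)),
]

# Rough equivalent of values("a").annotate(Max("created_ts")): latest timestamp per a.
latest_per_a = {}
for pk, a, ts in rows:
    if a not in latest_per_a or ts > latest_per_a[a]:
        latest_per_a[a] = ts

# Rough equivalent of filter(created_ts__in=...): timestamps are matched regardless of a.
max_timestamps = set(latest_per_a.values())
matched_pks = [pk for pk, a, ts in rows if ts in max_timestamps]

# Matching on the (a, created_ts) pair keeps only the true per-group latest rows.
exact_pks = [pk for pk, a, ts in rows if latest_per_a[a] == ts]

print(matched_pks)
print(exact_pks)
```

With this data the timestamp-only match also returns pk 3, while matching on the pair does not.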

Upvotes: 0

theShobster

Reputation: 66

From https://docs.djangoproject.com/en/1.8/ref/models/querysets/#queryset-api, since most queryset methods return a new queryset, you should be able to chain them like this:

MyModel.objects.order_by('a', '-created_ts').distinct('a')

Upvotes: 1

Jared Mackey

Reputation: 4158

This works if you are on PostgreSQL. It will not work on MySQL or SQLite, because distinct() with field names is only supported on PostgreSQL.

MyModel.objects.order_by('a', '-created_ts').distinct('a')
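For intuition, the ordering plus distinct('a') keeps the first row of each group of equal a, and that row is the latest one because of the descending created_ts. A rough plain-Python sketch of that behaviour, using hypothetical dict rows rather than model instances:

```python
from datetime import datetime
from itertools import groupby
from operator import itemgetter

# Hypothetical rows; the pk values are made up for illustration.
rows = [
    {"pk": 1, "a": 1, "created_ts": datetime(2015, 9, 8, 20, 5, 51)},
    {"pk": 2, "a": 1, "created_ts": datetime(2015, 9, 8, 20, 8, 40)},
    {"pk": 3, "a": 3, "created_ts": datetime(2015, 9, 8, 20, 8, 58)},
    {"pk": 4, "a": 2, "created_ts": datetime(2015, 9, 8, 20, 9, 8)},
    {"pk": 5, "a": 2, "created_ts": datetime(2015, 9, 8, 20, 10, 20)},
]

# Emulate order_by('a', '-created_ts') with two stable sorts, secondary key first.
ordered = sorted(rows, key=itemgetter("created_ts"), reverse=True)
ordered = sorted(ordered, key=itemgetter("a"))

# Emulate distinct('a'): keep only the first row of each run of equal a.
latest_by_a = {a: next(group) for a, group in groupby(ordered, key=itemgetter("a"))}

for a, row in latest_by_a.items():
    print(a, row["pk"], row["created_ts"])
```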

Edit: Whoops, I just saw that you are on MySQL.

This will help you:

from django.db.models import Count, Max
MyModel.objects.values('a').annotate(count=Count('a'), latest_date=Max('created_ts'))

Data in table

    a    created_ts
    -    --------------------------------
    1    2015-09-08 20:05:51.144321+00:00
    1    2015-09-08 20:08:40.687936+00:00
    3    2015-09-08 20:08:58.472077+00:00
    2    2015-09-08 20:09:08.493748+00:00
    2    2015-09-08 20:10:20.906069+00:00

Output

[
    {'count': 2, 'latest_date': datetime.datetime(2015, 9, 8, 20, 8, 40, 687936, tzinfo=<UTC>), 'a': 1},
    {'count': 2, 'latest_date': datetime.datetime(2015, 9, 8, 20, 10, 20, 906069, tzinfo=<UTC>), 'a': 2},
    {'count': 1, 'latest_date': datetime.datetime(2015, 9, 8, 20, 8, 58, 472077, tzinfo=<UTC>), 'a': 3}
]

Upvotes: 3

James Lin

Reputation: 26528

Not the most direct answer, but I hope it helps. You can use a raw query:

MyModel.objects.raw(
    """
    SELECT *
    FROM mymodel main
    INNER JOIN (
        SELECT MAX(created_ts) AS max_ts, a
        FROM mymodel
        GROUP BY a
    ) sub ON main.a = sub.a AND main.created_ts = sub.max_ts
    """
)

I haven't run the query myself, so it might have syntax errors, but you get the idea.
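As a sanity check of the SQL idea (joining each row against its group's maximum), here is a small sketch using Python's built-in sqlite3 module instead of Django. The table name mymodel, the sample rows, and the alias m are assumptions for illustration; a real Django table is usually named after the app as well.

```python
import sqlite3

# In-memory database with a hypothetical table mirroring the model's columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, a INTEGER, created_ts TEXT)")
conn.executemany(
    "INSERT INTO mymodel (id, a, created_ts) VALUES (?, ?, ?)",
    [
        (1, 1, "2015-09-08 20:05:51"),
        (2, 1, "2015-09-08 20:08:40"),
        (3, 3, "2015-09-08 20:08:58"),
        (4, 2, "2015-09-08 20:09:08"),
        (5, 2, "2015-09-08 20:10:20"),
    ],
)

# Join every row against the per-group maximum timestamp; only the latest rows survive.
latest_rows = conn.execute(
    """
    SELECT m.id, m.a, m.created_ts
    FROM mymodel m
    INNER JOIN (
        SELECT a, MAX(created_ts) AS max_ts
        FROM mymodel
        GROUP BY a
    ) sub ON m.a = sub.a AND m.created_ts = sub.max_ts
    ORDER BY m.a
    """
).fetchall()
conn.close()

for row in latest_rows:
    print(row)
```

The timestamps are stored as ISO-formatted text here, so MAX compares them as strings, which gives the chronological order for this format. If two rows in the same group tie on created_ts, the join returns both.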

Upvotes: 3

Ben Hare

Reputation: 4415

Not sure how to do it with a single queryset, but if you're OK with running more queries in exchange for not using raw SQL, you could do

from django.db.models import Max
max_ts_queryset = MyModel.objects.values('a').order_by('a').annotate(max_ts=Max('created_ts'))

to get the max ts for each value of a, then just loop over that with something like

final_list = {}
for obj in max_ts_queryset:
    # One extra query per distinct value of a
    final_list[obj['a']] = MyModel.objects.get(a=obj['a'], created_ts=obj['max_ts'])

return final_list
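If the table is small enough to load in full, a possible alternative is to build the dictionary in one pass: go through the rows in ascending created_ts order and let later rows overwrite earlier ones. The sketch below uses a hypothetical namedtuple as a stand-in for MyModel instances; with the ORM this would roughly correspond to iterating over MyModel.objects.order_by('created_ts'), at the cost of fetching every row.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for MyModel instances; not a real Django model.
Row = namedtuple("Row", ["pk", "a", "created_ts"])

rows = [
    Row(1, 1, datetime(2015, 9, 8, 20, 5, 51)),
    Row(2, 1, datetime(2015, 9, 8, 20, 8, 40)),
    Row(3, 3, datetime(2015, 9, 8, 20, 8, 58)),
    Row(4, 2, datetime(2015, 9, 8, 20, 9, 8)),
    Row(5, 2, datetime(2015, 9, 8, 20, 10, 20)),
]

# Walk the rows oldest first; each assignment replaces any older row with the same a,
# so the dictionary ends up holding the latest row per value of a.
latest_by_a = {}
for row in sorted(rows, key=lambda r: r.created_ts):
    latest_by_a[row.a] = row

for a, row in latest_by_a.items():
    print(a, row)
```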

Upvotes: 1
