user173703
user173703

Reputation:

Django ORM: Getting rows based on max value of a column

I have a class Marketorders which contains information about single market orders and they are gathered in snapshots of the market (represented by class Snapshot). Each order can appear in more than one snapshot with the latest row of course being the relevant one.

class Marketorders(models.Model):
    id = models.AutoField(primary_key=True)
    snapid = models.IntegerField()
    orderid = models.IntegerField()
    reportedtime = models.DateTimeField(null=True, blank=True)
    ...


class Snapshot(models.Model):
    id = models.IntegerField(primary_key=True)
    ...

What I'm doing is getting all of the orders across several snapshots for processing, but I want to include only the most recent row for each order. In SQL I would simply do:

SELECT m1.* FROM marketorders m1 WHERE reportedtime = (SELECT max(reportedtime)  
FROM marketorders m2 WHERE m2.orderid=m1.orderid);

or better yet with a join:

SELECT m1.* FROM marketorders m1 LEFT JOIN marketorders m2 ON 
m1.orderid=m2.orderid AND m1.reportedtime < m2.reportedtime 
WHERE m2.orderid IS NULL;

However, I just can't figure out how to do this with Django ORM. Is there any way to accomplish this without raw SQL?

EDIT: Just to clarify the problem. Let's say we have the following marketorders (leaving out everything unimportant and using only orderid, reportedtime):

1, 09:00:00
1, 10:00:00
1, 12:00:00
2, 09:00:00
2, 10:00:00

How do I get the following set with the ORM?

1, 12:00:00
2, 10:00:00

Upvotes: 3

Views: 7357

Answers (2)

Botond B&#233;res
Botond B&#233;res

Reputation: 16683

If I understood right you need a list of Marketorder objects that contains each Marketorder with highest reportedtime per orderid

Something like this should work (disclaimer: didn't test it directly):

m_orders = Marketorders.objects.filter(id__in=(
    Marketorders.objects
        .values('orderid')
        .annotate(Max('reportedtime'))
        .values_list('id', flat=True)
))

For documentation check:

http://docs.djangoproject.com/en/dev/topics/db/aggregation/

Edit: This should get a single Marketorder with highest reportedtime for a specific orderid

order = (
    Marketorders.objects
        .filter(orderid=the_orderid)
        .filter(reportedtime=(
            Marketorders.objects
                .filter(orderid=the_orderid)
                .aggregate(Max('reportedtime'))
                ['reportedtime__max']
        ))
)

Upvotes: 2

vikingosegundo
vikingosegundo

Reputation: 52237

Do you have a good reason why you don't use ForeignKey or (in your case better) ManyToManyField. These fields represent the relational structure of ur models.

Furthermore it is not necessary to declare an pk-field id. if no pk is defined, django adds id.

The code below allow orm-queries like this:

   m1 = Marketorder()
   m1.save() # important: primary key will be added in db
   s1 = Snapshot()
   s2 = Snapshot()
   s1.save()
   s2.save()
   m1.snapshots.add(s1)
   m1.snapshots.add(s2)
   m1.snapshots.all()
   m1.snapshots.order_by("id") # snapshots in relations with m1 
                               # ordered by id, that is added automatically
   s1.marketorder_set.all()    # reverse

so for your query:

snapshot = Snapshot.objects.order_by('-id')[0] # order desc, pick first
marketorders = snapshot.marketorder_set.all()  # all marketorders in snapshot

or compact:

marketorders = Snapshot.objects.order_by('-id')[0].marketorder_set.all()

models.py:

class Snapshot(models.Model):
    name = models.CharField(max_length=100)

class Marketorder(models.Model):
    snapshots = models.ManyToManyField(Snapshot)
    reportedtime = models.DateTimeField(auto_now= True)

By convention all model classes name are singular. Django makes it plural in different places automatically.

more on queries (filtering, sorting, complex lookups). a must-read.

Upvotes: -1

Related Questions