bb89
bb89

Reputation: 803

Django aggregate count of records per day

I've got a django app that is doing some logging. My model looks like this:

class MessageLog(models.Model):
    logtime = models.DateTimeField(auto_now_add=True)
    user = models.CharField(max_length=50)
    message = models.CharField(max_length=512)

What a want to do is get the average number of messages logged per day of the week so that I can see which days are the most active. I've managed to write a query that pulls the total number of messages per day which is:

for i in range(1, 8):
    MessageLog.objects.filter(logtime__week_day=i).count()

But I'm having trouble calculating the average in a query. What I have right now is:

for i in range(1, 8):
    MessageLog.objects.filter(logtime__week_day=i).annotate(num_msgs=Count('id')).aggregate(Avg('num_msgs'))

For some reason this is returning 1.0 for every day though. I looked at the SQL it is generating and it is:

SELECT AVG(num_msgs) FROM (
SELECT 
`myapp_messagelog`.`id` AS `id`, `myapp_messagelog`.`logtime` AS `logtime`, 
`myapp_messagelog`.`user` AS `user`, `myapp_messagelog`.`message` AS `message`, 
COUNT(`myapp_messagelog`.`id`) AS `num_msgs` 
FROM `myapp_messagelog` 
WHERE DAYOFWEEK(`myapp_messagelog`.`logtime`) = 1 
GROUP BY `myapp_messagelog`.`id` ORDER BY NULL
) subquery

I think the problem might be coming from the GROUP BY id but I'm not really sure. Anyone have any ideas or suggestions? Thanks in advance!

Upvotes: 9

Views: 12536

Answers (3)

ahprosim
ahprosim

Reputation: 358

queryset.extra(select={'day': 'date(logtime)'}).values('day').order_by('-day').annotate(Count('id'))

Upvotes: 2

Sectio Aurea
Sectio Aurea

Reputation: 393

I do something similar with a datetime field, but annotating over extra values does work for me. I have a Record model with a datetime field "created_at" and a "my_value" field I want to get the average for.

from django.db.models import Avg

qs = Record.objects.extra({'created_day':"date(created_at)"}).\
    values('created_day').\
    annotate(count=Avg('my_value'))

The above will group by the day of the datetime value in "created_at" field.

Upvotes: 5

Peter DeGlopper
Peter DeGlopper

Reputation: 37364

The reason your listed query always gives 1 is because you're not grouping by date. Basically, you've asked the database to take the MessageLog rows that fall on a given day of the week. For each such row, count how many ids it has (always 1). Then take the average of all those counts, which is of course also 1.

Normally, you would need to use a values clause to group your MessageLog rows prior to your annotate and aggregate parts. However, since your logtime field is a datetime rather than just a date, I am not sure you can express that directly with Django's ORM. You can definitely do it with an extra clause, as shown here. Or if you felt like it you could declare a view in your SQL with as much of the aggregating and average math as you liked and declare an unmanaged model for it, then just use the ORM normally.

So an extra field works to get the total number of records per actual day, but doesn't handle aggregating the average of the computed annotation. I think this may be sufficiently abstracted from the model that you'd have to use a raw SQL query, or at least I can't find anything that makes it work in one call.

That said, you already know how you can get the total number of records per weekday in a simple query as shown in your question.

And this query will tell you how many distinct date records there are on a given weekday:

MessageLog.objects.filter(logtime__week_day=i).dates('logtime', day').count()

So you could do the averaging math in Python instead, which might be simpler than trying get the SQL right.

Alternately, this query will get you the raw number of messages for all weekdays in one query rather than a for loop:

MessageLog.objects.extra({'weekday': "dayofweek(logtime)"}).values('weekday').annotate(Count('id'))

But I haven't been able to get a nice query to give you the count of distinct dates for each weekday annotated to that - dates querysets lose the ability to handle annotate calls, and annotating over an extra value doesn't seem to work either.

This has been surprisingly tricky, given that it's not that hard a SQL expression.

Upvotes: 10

Related Questions