targhs
targhs

Reputation: 1797

Django ORM remove unwanted Group by when annotate multiple aggregate columns

I want to create a query something like this in django ORM.

SELECT COUNT(CASE WHEN myCondition THEN 1 ELSE NULL end) as numyear
FROM myTable

Following is the djang ORM query i have written

year_case = Case(When(added_on__year = today.year, then=1), output_field=IntegerField())

qs = (ProfaneContent.objects
                    .annotate(numyear=Count(year_case))
                    .values('numyear'))

This is the query which is generated by django orm.

SELECT COUNT(CASE WHEN "analyzer_profanecontent"."added_on" BETWEEN 2020-01-01 00:00:00+00:00 AND 2020-12-31 23:59:59.999999+00:00 THEN 1 ELSE NULL END) AS "numyear" FROM "analyzer_profanecontent" GROUP BY "analyzer_profanecontent"."id"

All other things are good, but django places a GROUP BY at the end leading to multiple rows and incorrect answer. I don't want that at all. Right now there is just one column but i will place more such columns.

EDIT BASED ON COMMENTS I will be using the qs variable to get values of how my classifications have been made in the current year, month, week.

UPDATE On the basis of comments and answers i am getting here let me clarify. I want to do this at the database end only (obviously using Django ORM and not RAW SQL). Its a simple sql query. Doing anything at Python's end will be inefficient since the data can be too large. Thats why i want the database to get me the sum of records based on the CASE condition. I will be adding more such columns in the future so something like len() or .count will not work.

I just want to create the above mentioned query using Django ORM (without an automatically appended GROUP BY).

Upvotes: 7

Views: 5329

Answers (4)

hynekcer
hynekcer

Reputation: 15558

If you need to summarize only to one row then you should to use an .aggregate() method instead of annotate().

result = ProfaneContent.objects.aggregate(
    numyear=Count(year_case),
    # ... more aggregated expressions are possible here
)

You get a simple dictionary of result columns:

>>> result
{'numyear': 7, ...}

The generated SQL query is without groups, exactly how required:

SELECT
  COUNT(CASE WHEN myCondition THEN 1 ELSE NULL end) as numyear
  -- and more possible aggregated expressions
FROM myTable

Upvotes: 3

datosula
datosula

Reputation: 1596

When using aggregates in annotations, django needs to have some kind of grouping, if not it defaults to primary key. So, you need to use .values() before .annotate(). Please see django docs.

But to completely remove group by you can use a static value and django is smart enough to remove it completely, so you get your result using ORM query like this:

year_case = Case(When(added_on__year = today.year, then=1), output_field=IntegerField())

qs = (ProfaneContent.objects
                    .annotate(dummy_group_by = Value(1))
                    .values('dummy_group_by')
                    .annotate(numyear=Count(year_case))
                    .values('numyear'))

Upvotes: 35

Ryan David Hocking
Ryan David Hocking

Reputation: 127

This is how I would write it in SQL.

SELECT SUM(CASE WHEN myCondition THEN 1 ELSE 0 END) as numyear
FROM myTable

SELECT 
    SUM(CASE WHEN "analyzer_profanecontent"."added_on" 
                 BETWEEN 2020-01-01 00:00:00+00:00 
                     AND 2020-12-31 23:59:59.999999+00:00 
             THEN 1 
             ELSE 0 
         END) AS "numyear" 
FROM "analyzer_profanecontent" 
GROUP BY "analyzer_profanecontent"."id"

If you intend to use other items in the SELECT clause I would recommend using a group by as well which would look like this:

SELECT SUM(CASE WHEN myCondition THEN 1 ELSE 0 END) as numyear
FROM myTable
GROUP BY SUM(CASE WHEN myCondition THEN 1 ELSE 0 END)

Upvotes: -2

Okkie
Okkie

Reputation: 335

What about a list comprehension:

# get all the objects
profane = ProfaneContent.objects.all()

# Something like this 
len([pro for pro in profane if pro.numyear=today.year])

if the num years are equal it will add it to the list, so at the and you can check the len()

to get the count

Hopefully this is helpfull!

Upvotes: -2

Related Questions