user4150760

Reputation: 2939

Primary key requirement in raw SQL complicates the query in Django

To get the max value from a simple table of values, I can write the following query in Django:

MyTable.objects.aggregate(Max('value'))

The generated SQL is: 'SELECT MAX("mytable"."value") AS "value__max" FROM "mytable"'

Now if I write the same SQL using the raw query manager:

1. MyTable.objects.raw('SELECT max(value) FROM mytable')

Django throws the error "InvalidQuery: Raw query must include the primary key". This is also mentioned in the Django docs: "There is only one field that you can’t leave out - the primary key field". After adding the id field, I need a GROUP BY as well, because id is selected next to an aggregate. The new query becomes:

2. MyTable.objects.raw('SELECT id, max(value) FROM mytable GROUP BY id')

This doesn't give me a single max value anymore, because I'm forced to use GROUP BY id. Now I need to add ORDER BY and LIMIT clauses to get the expected answer from an otherwise simple SQL statement that works on its own.

3. MyTable.objects.raw('SELECT id, max(value) AS mv FROM mytable GROUP BY id ORDER BY mv DESC LIMIT 1')

Is there a way to simplify the above query, i.e. not use ORDER BY/LIMIT/GROUP BY? (FWIW, I'm using PostgreSQL.)
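To make the difference between the three queries concrete, here is a minimal sketch that runs them against a tiny table. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the three sample rows are made up for illustration. An ORDER BY id is added to query 2 only to keep the output order predictable.

```python
import sqlite3

# Hypothetical stand-in for the question's table: three rows with unique ids.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable (id, value) VALUES (?, ?)",
    [(1, 10), (2, 30), (3, 20)],
)

# Query 1: a single aggregate row for the whole table.
q1 = conn.execute("SELECT max(value) FROM mytable").fetchall()

# Query 2: id is unique, so each group holds exactly one row and
# max(value) is just that row's value. One result row per table row.
q2 = conn.execute(
    "SELECT id, max(value) FROM mytable GROUP BY id ORDER BY id"
).fetchall()

# Query 3: sort the per-id groups and keep only the top one.
q3 = conn.execute(
    "SELECT id, max(value) AS mv FROM mytable "
    "GROUP BY id ORDER BY mv DESC LIMIT 1"
).fetchall()

print(q1)  # [(30,)]
print(q2)  # [(1, 10), (2, 30), (3, 20)]
print(q3)  # [(2, 30)]
```

Query 2 returns the whole table because grouping by a unique column never merges rows, which is why the ORDER BY and LIMIT are needed to get back to a single value.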

Update:

Here's a hack that'll work. I alias the max value as id to make Django happy. Is there any issue here?

MyTable.objects.raw('SELECT max(value) AS id FROM mytable')

Update 2:

Here's the query plan for the simple SQL (1) vs the complicated final one (3):

"Aggregate  (cost=5.25..5.26 rows=1 width=2) (actual time=0.155..0.155 rows=1 loops=1)"
"  ->  Seq Scan on mytable  (cost=0.00..4.60 rows=260 width=2) (actual time=0.018..0.067 rows=260 loops=1)"
"Total runtime: 0.222 ms"


"Limit  (cost=9.80..9.80 rows=1 width=6) (actual time=0.548..0.548 rows=1 loops=1)"
"  ->  Sort  (cost=9.80..10.45 rows=260 width=6) (actual time=0.545..0.545 rows=1 loops=1)"
"        Sort Key: (max(value))"
"        Sort Method: top-N heapsort  Memory: 25kB"
"        ->  HashAggregate  (cost=5.90..8.50 rows=260 width=6) (actual time=0.328..0.432 rows=260 loops=1)"
"              ->  Seq Scan on mytable  (cost=0.00..4.60 rows=260 width=6) (actual time=0.018..0.069 rows=260 loops=1)"
"Total runtime: 0.638 ms"

P.S. The actual query is more complicated (somewhat related to this answer: https://dba.stackexchange.com/a/86404/52114)

Upvotes: 6

Views: 8759

Answers (4)

NegaOverflow

Reputation: 138

I just had the same issue, and @Tinashe Robert's answer did not work for me. Here is my solution (Python 3.7.7, Django 3.0.5).

PS: Yes, @catavaran's answer uses plain Python DB access, but what if one really wants to use raw()?

  • First, one column of the model must be declared as the primary key (I was using a MySQL database), i.e.:
 from django.db import models

 class MyTable(models.Model):
     id = models.AutoField(primary_key=True)
     value = models.IntegerField()
     ... 

Then, in the Django shell:

>>> MyTable.objects.raw('SELECT id, max(value) as mx from MyTable')[0].mx

Or, a simpler solution without raw SQL:

>>> from django.db.models import Max
>>> MyTable.objects.all().aggregate(Max('value'))

Upvotes: 1

Tinashe Robert

Reputation: 711

You can use a constant as the id:

ModelName.objects.raw('SELECT 1 as id , max(value) FROM mytable')

Upvotes: 8

Lajos Arpad

Reputation: 77012

I would do something like:

select id, value from mytable order by value desc limit 1
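As a rough check of what this query returns, here is a small sketch using Python's built-in sqlite3 module with made-up sample rows (both are assumptions for illustration, not the asker's PostgreSQL setup). Because the query selects a real id, it should also satisfy the primary key requirement of raw().

```python
import sqlite3

# Hypothetical sample data; the real table lives in PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable (id, value) VALUES (?, ?)",
    [(1, 10), (2, 30), (3, 20)],
)

# The top row after sorting by value carries both the max and its real id.
row = conn.execute(
    "SELECT id, value FROM mytable ORDER BY value DESC LIMIT 1"
).fetchone()

print(row)  # (2, 30)
```

One caveat when porting this to PostgreSQL: if value can be NULL, a descending sort puts NULLs first by default, so ORDER BY value DESC NULLS LAST may be needed to match what max(value) returns.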

Upvotes: 0

catavaran

Reputation: 45585

You should execute custom SQL directly instead of using the Manager.raw() method:

from django.db import connection

cursor = connection.cursor()
cursor.execute('SELECT max(value) FROM mytable')
max_value = cursor.fetchone()[0]
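One detail worth knowing about the fetchone()[0] pattern: an aggregate without GROUP BY always returns exactly one row, so on an empty table the value is None rather than an error. Django's cursor largely follows the standard Python DB-API, so the sketch below uses the built-in sqlite3 module as a stand-in. The in-memory table and sample values are assumptions for illustration.

```python
import sqlite3

# Stand-in for django.db.connection; the table starts out empty.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, value INTEGER)")

cursor = conn.cursor()

# On an empty table, max() still yields one row whose only column is NULL.
cursor.execute("SELECT max(value) FROM mytable")
empty_max = cursor.fetchone()[0]

# After inserting some rows, the same query yields the real maximum.
cursor.executemany("INSERT INTO mytable (value) VALUES (?)", [(10,), (30,), (20,)])
cursor.execute("SELECT max(value) FROM mytable")
max_value = cursor.fetchone()[0]
cursor.close()

print(empty_max)  # None
print(max_value)  # 30
```

Callers that expect a number may want to handle the None case explicitly.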

Upvotes: 15
