Thomas Milan
Thomas Milan

Reputation: 309

Filter list items within a Django JSONField

Here is a simplified version of a Django model I'm working on.

class Course(models.Model):
    infos = JSONField()

The infos JSONField looks like this :

infos = {
    category: "Maths",
    students: [
        {
            name: "Alice",
            result: 8
        },
        {
            name: "Bob",
            result: 12
        }
    ]
}

# (students can have 0 -> n items)

I'm trying to get a list of all students that got at least 10 on any course (result >= 10). However I'm having trouble figuring out a way to filter the QuerySet accordingly.

I'm trying to do something like this :

(Course.objects.filter(students__result__gte=10)
               .values_list('students', flat=True))

But since students is a list, I can't access directly to the result property of each item. Besides, I don't think it would exclude the "Alice" object from the results.

I'd like to get results like this :

items = [
    {
        name: "Bob",
        result: 12
    }
]

Bonus points if there is a way to link filtered students and course category :

items = [
    {
        category: "Maths",
        students: [
            {
                name: "Bob",
                result: 12
            }
        ]
    }
 ]

How can I achieve the desired result ?

Upvotes: 3

Views: 816

Answers (3)

Nikhil Khandelwal
Nikhil Khandelwal

Reputation: 348

You could try defining a custom manager for your Course Class. This would enable you to make the query you want in the Django-ORM style without using raw SQL.

PS:- I"ll edit this answer with more code and details if this suits your needs.

Upvotes: 0

Jura Brazdil
Jura Brazdil

Reputation: 1100

In case you're using a PostgreSQL database, you can switch to an HStoreField ([docs here][1]). The HStoreField will allow you to use exactly the lookup you mention (students__result__gte). Otherwise the most Django-esque way would be to create a Student model like Arakkal suggests in the comments under your question.

EDIT: This is the simplest way to be able to filter given all the maps/dicts are the same (if you really want the same structure as your JSON):

class Course(models.Model):
    pass

class Info(models.Model):
    course = models.ForeignKey(Course, related_name='infos')
    category = models.CharField(max_length=128)
    # students = models.HStoreField()  # available in postgresql

class Student(models.Model):
    """If not using PostgreSQL, also more Django-esque."""
   info = models.ForeignKey(Info, related_name='students')

   name = models.CharField(max_length=128)
   result = models.IntegerField()

I'm trying to get a list of all students that got at least 10 on any course (result >= 10)

This would be my prefered structure given your goal and my intuition about the logical relationships:

class Course(models.Model):
    students = models.ManyToManyField(Student, through=Result)
    category = models.CharField(max_length=128)

class Student(models.Model):
    name = models.CharField(max_length=128)

class Result(models.Model):
    course = models.ForeignKey(Course, on_delete=models.deletion.CASCADE)
    student = models.ForeignKey(Student, on_delete=models.deletion.CASCADE)
    grade = models.IntegerField()

That allows for: Result.objects.filter(grade__gte=10).values_list('student__name', flat=True)

Upvotes: 2

JPG
JPG

Reputation: 88459

By using raw SQL, I have managed to obtain the result.

select array_to_json(array_agg(students_array))
FROM
    course_table,
    json_array_elements(CAST(course_table.infos->>'students' as json)) students_array
WHERE
    CAST(students_array->>'result' as integer) >= 10
;

Result

sample_db=# SELECT version();
PostgreSQL 10.12 (Ubuntu 10.12-0ubuntu0.18.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0, 64-bit

sample_db=# select infos from course_table;
{"category": "Maths", "students": [{"name": "Alice", "result": 8}, {"name": "Bob", "result": 12}]}
{"category": "Science", "students": [{"name": "Jerin", "result": 8}, {"name": "George", "result": 12}]}
{"category": "Physics", "students": [{"name": "Vivek", "result": 17}, {"name": "Osama", "result": 6}]}

sample_db=# select array_to_json(array_agg(students_array))
sample_db-# FROM
sample_db-# course_table,
sample_db-#  json_array_elements(CAST(course_table.infos->>'students' as json)) students_array
sample_db-# WHERE
sample_db-#  CAST(students_array->>'result' as integer) >= 10
sample_db-# ;
[{"name": "Bob", "result": 12},{"name": "George", "result": 12},{"name": "Vivek", "result": 17}]

This raw SQL can be executed by Django as,

raw_sql = """
select 1 as id, array_to_json(array_agg(students_array)) as result
FROM
    course_table,
    json_array_elements(CAST(course_table.infos->>'students' as json)) students_array
WHERE
    CAST(students_array->>'result' as integer) >= 10
;
"""
qs = Course.objects.raw(raw_sql)
for i in qs:
    print(i.result)

Upvotes: 2

Related Questions