purefanatic
purefanatic

Reputation: 1023

How to query tuples of columns in Django database queries?

I have some table ports(switch_ip, slot_number, port_number, many, more, columns) and would like to achieve the following PostgreSQL query using Django:

SELECT switch_ip, array_agg((slot_number, port_number, many, more, columns) ORDER BY slot_number, port_number) info
FROM ports
GROUP BY switch_ip
ORDER BY switch_ip

Using django.contrib.postgres.aggregates here's what I got so far:

Port.objects \
    .values('switch_ip') \
    .annotate(
        info=ArrayAgg('slot_number', ordering=('slot_number', 'port_number'))
    ) \
    .order_by('switch_ip')

I am unable to include more than one column in the ArrayAgg. None of ArrayAgg(a, b, c), ArrayAgg((a, b, c)), ArrayAgg([a, b, c]) seem to work. A workaround could involve separate ArrayAggs for each column and each with the same ordering. I would despise this because I have many columns. Is there any nicer workaround, possibly more low-level?

I suspect this is no issue with ArrayAgg itself but rather with tuple expressions in general. Is there any way to have tuples at all in Django queries? For example, what would be the corresponding Django of:

SELECT switch_ip, (slot_number, port_number, many, more, columns) info
FROM ports

If this is not yet possible in Django, how feasible would it be to implement?

Upvotes: 0

Views: 2721

Answers (2)

Anton M.
Anton M.

Reputation: 395

I have spent lot of time searching for a working solution and here is a full recipe with code example.

  1. You need to define Array "function" with square brackets in template
from django.db.models.expressions import Func

class Array(Func):
    template = '%(function)s[%(expressions)s]'
    function = 'ARRAY'
  1. You need to define output field format (it must be array of some django field). For example an array of strings
from django.contrib.postgres.fields import ArrayField
from django.db.models.fields import CharField

out_format = ArrayField(CharField(max_length=200))
  1. Finally make an ArrayAgg expression
from django.db.models import F

annotate = {'2-fields': ArrayAgg(Array(F('field1'), F('field2'), output_field=out_format), distinct=True) }
model.objects.all().annotate(**annotate)
  1. (Optional) If field1 or field2 are not CharFields, you may include Cast as an argument of Array
from django.db.models.functions import Cast


annotate = {'2-fields': ArrayAgg(Array(Cast(F('field1'), output_field=CharField(max_length=200)), F('field2'), output_field=out_format), distinct=True) }

Upvotes: 4

purefanatic
purefanatic

Reputation: 1023

Having done a bit more research I guess one could add the missing tuple functionality as follows:

  1. Create a new model field type named TupleField. The implementation might look kind of similar to django.contrib.postgres.fields.ArrayField. TupleField would be rather awkward because I don't think any RDBMS allows for composite types to be used as column types so usage of TupleField would be limited to (possibly intermediate?) query results.
  2. Create a new subclass of django.db.models.Expression which wraps multiple expressions on its own (like Func in general, so looking at Func's implementation might be worthwile) and evaluates to a TupleField. Name this subclass TupleExpression for example.

Then I could simply annotate with ArrayAgg(TupleExpression('slot_number', 'port_number', 'many', 'more', 'columns'), ordering=('slot_number', 'port_number')) to solve my original problem. This would annotate each switch_ip with correctly-ordered arrays of tuples where each tuple represents one switch port.

Upvotes: 0

Related Questions