user524824
user524824

Reputation:

Django Orm: Custom Select column with aggregate functions for grouped values

My postgres database table looks (simplified) like this: timeframe::timestamp, value::integer.

I have a grouped query (grouped timeranges: 1 hour, 1 day, a week, etc) and I have different aggregate functions.

An example query looks like this in sql:

SELECT
    date_trunc(%s, timeframe),
    SUM(CASE WHEN metric = 'visitors' THEN value ELSE 0 END) / NULLIF(SUM(CASE WHEN metric = 'total' THEN value ELSE 0 END), 0) as value
FROM aggregation_metrichour
GROUP BY date_trunc(%s, timeframe)

I could use raw queries, but I need the orm to dynamically filter for specific dateranges, some other information and do authorization.

The only part of my query that will change is the way the value is computer, I use here different things like sum of products, avg and sum.

I tried to built this with the orm, but failed.

What I want to do is add a custom select column to my django queryset that includes some fancy raw calculations with aggregate functions. The best way for me would be to just take a string with the calculation and add it to the query set.

This is what I have done so far:

simple with default aggregate function

result = MetricHour.objects \
    .extra(select={'date': 'date_trunc(%s, timeframe)'}, select_params=[interval]) \
    .values('date') \
    .annotate(value=Sum('value'))

 print 'simple', result.query
 # simple SELECT (date_trunc(day, timeframe)) AS "date",
 # SUM("aggregation_metrichour"."value") AS "value" FROM
 # "aggregation_metrichour" GROUP BY (date_trunc(day, timeframe))

using extra select

result = MetricHour.objects \
    .extra(select={'date': 'date_trunc(%s, timeframe)'}, select_params=[interval]) \
    .extra(select={'value': "SUM(CASE WHEN metric = 'visitors' THEN value ELSE 0 END) / NULLIF (SUM(value), 0)"}) \
    .values('date', 'value')

print 'extra-select', result.query # wont add a group by
# extra-select SELECT (date_trunc(day, timeframe)) AS "date",
# (SUM(CASE WHEN metric = 'visitors' THEN value ELSE 0 END) /
# NULLIF (SUM(value), 0)) AS "value" FROM "aggregation_metrichour"

custom aggregate function

Then I found this stackoverflow question on how to write your own aggregate functions. However the code in django changed I think. I now need to set a name as a string in the VisitorRate (models.Aggregate) that I don't know how to add new custom types

class VisitorRateSql(models.sql.aggregates.Sum):
    sql_template = "SUM(CASE WHEN metric = visitors' THEN value ELSE 0 END) / NULLIF (SUM(value), 0)"

class VisitorRate(models.Sum):
    name = 'Sum'
    sql = VisitorRateSql
    def add_to_query(self, query, alias, col, source, is_summary):
            aggregate = VisitorRateSql(col,
                                       source=source,
                                       is_summary=is_summary,
                                       **self.extra)
            query.aggregates[alias] = aggregate


result = MetricHour.objects \
    .extra(select={'date': 'date_trunc(%s, timeframe)'}, select_params=[interval]) \
    .values('date') \
    .annotate(value=VisitorRate('value'))

print "Annotate Class", result.query
# Annotate Class SELECT (date_trunc(day, timeframe)) AS "date",
# SUM(CASE WHEN metric = visitors' THEN value ELSE 0 END) /
# NULLIF (SUM(value), 0) AS "value" FROM "aggregation_metrichour"
# GROUP BY (date_trunc(day, timeframe))

Update: I did not fully understand the models.Aggregate function This looks way better. However I do need to simplify this for my use case. I want to give the sql_template directly to the annotate function

I want to know how I can easily add a new custom select query to my grouped query! Any help is appreciated!

Upvotes: 1

Views: 3280

Answers (1)

user524824
user524824

Reputation:

I found a nice solution with a parametrized class that works as an Aggreation Function:

def custom_aggregation(select_query):
    class SqlAggregate(models.sql.aggregates.Aggregate):
        sql_function = ''
        sql_template = select_query

    class VisitorRate(models.Aggregate):
        sql = SqlAggregate
        def add_to_query(self, query, alias, col, source, is_summary):
            aggregate = self.sql(col,
                                 source=source,
                                 is_summary=is_summary,
                                 **self.extra)
            query.aggregates[alias] = aggregate

    return VisitorRate


    aggregate_query = "SUM(CASE WHEN metric = visitors' THEN value ELSE 0 END) / NULLIF (SUM(value), 0)"
    AggregationFunction = custom_aggregation(aggregate_query)

    result = MetricHour.objects \
        .extra(select={'date': 'date_trunc(%s, timeframe)'}, select_params=[interval]) \
        .values('date') \
        .annotate(value=AggregationFunction('value'))

Upvotes: 2

Related Questions