Reputation: 165
Let's say that I have a table with a column, that has some integer values and I want to calculate the percentage of values that are over 200 for that column.
Here's the kicker, I would prefer if I could do it inside one query that I could use group_by on.
results = db.session.query(
ClassA.some_variable,
label('entries', func.count(ClassA.some_variable)),
label('percent', *no clue*)
).filter(ClassA.value.isnot(None)).group_by(ClassA.some_variable)
Alternately it would be okay thought not prefered to do the percentage calculation on the client side, something like this.
results = db.session.query(
ClassA.some_variable,
label('entries', func.count(ClassA.some_variable)),
label('total_count', func.count(ClassA.value)),
label('over_200_count', func.count(ClassA.value > 200)),
).filter(ClassA.value.isnot(None)).group_by(ClassA.some_variable)
But I obviously can't filter within the count statemenet, and I can't apply the filter at the end of the query, since if I apply the > 200 constraint at the end, total_count wouldn't work.
Using RAW SQL is an option too, it doesn't have to be Sqlalchemy
Upvotes: 1
Views: 4998
Reputation: 52939
MariaDB unfortunately does not support the aggregate FILTER clause, but you can work around that using a CASE expression or NULLIF, since COUNT returns the count of non-null values of given expression:
from sqlalchemy import case
...
func.count(case([(ClassA.value > 200, 1)])).label('over_200_count')
With that in mind you can calculate the percentage simply as
(func.count(case([(ClassA.value > 200, 1)])) * 1.0 /
func.count(ClassA.value)).label('percent')
though there's that one edge: what if func.count(ClassA.value)
is 0? Depending on whether you'd consider 0 or NULL a valid return value you could either use yet another CASE expression or NULLIF:
dividend = func.count(case([(ClassA.value > 200, 1)])) * 1.0
divisor = func.count(ClassA.value)
# Zero
case([(divisor == 0, 0)],
else_=dividend / divisor).label('percent')
# NULL
(dividend / func.nullif(divisor, 0)).label('percent')
Finally, you could create a compilation extension for mysql dialect that rewrites a FILTER clause to a suitable CASE expression:
from sqlalchemy.ext.compiler import compiles
from sqlalchemy.sql.expression import FunctionFilter
from sqlalchemy.sql.functions import Function
from sqlalchemy import case
@compiles(FunctionFilter, 'mysql')
def compile_functionfilter_mysql(element, compiler, **kwgs):
# Support unary functions only
arg0, = element.func.clauses
new_func = Function(
element.func.name,
case([(element.criterion, arg0)]),
packagenames=element.func.packagenames,
type_=element.func.type,
bind=element.func._bind)
return new_func._compiler_dispatch(compiler, **kwgs)
With that in place you could express the dividend as
dividend = func.count(1).filter(ClassA.value > 200) * 1.0
which compiles to
In [28]: print(dividend.compile(dialect=mysql.dialect()))
count(CASE WHEN (class_a.value > %s) THEN %s END) * %s
Upvotes: 3