Matt Frei
Matt Frei

Reputation: 399

Blaze generating invalid sql for simple sql HAVING-style query?

I'm trying to perform a simple operation like a sql HAVING using Blaze. When I chain together a by operation with a filtering operation, Blaze doesn't properly chain the two commands.

Using this data in sqlite:

   index customer    item  quantity
0      0     Greg  Apples       5.0
1      1     Greg    Soup       NaN
2      2    Susan   Pears       2.0
3      3    Susan  Apples       3.0

These two operations result in invalid SQL

cnts = by(sqllitedf.customer,cnt=sqllitedf.quantity.count())
res = cnts[cnts.cnt > 1]

Here is the resulting SQL. Note that the two operations weren't correctly chained together. Instead, the COUNT function was used inappropriately.

print(compute(res))

SELECT df.customer, count(df.quantity) AS cnt 
FROM df 
WHERE count(df.quantity) > ? GROUP BY df.customer

Reproduction in Jupyter Notebook.

Upvotes: 0

Views: 46

Answers (1)

Matt Frei
Matt Frei

Reputation: 399

Answering my own question here: It looks as though this is a known issue with Blaze: https://github.com/blaze/blaze/issues/1457. Also, Blaze doesn't seem to have had much development since the Fall of 2016 so its not likely to be fixed soon. I'll be avoiding Blaze unless and until it becomes more active again.

Upvotes: 0

Related Questions