Reputation: 399
I'm trying to perform a simple operation like a sql HAVING using Blaze. When I chain together a by operation with a filtering operation, Blaze doesn't properly chain the two commands.
Using this data in sqlite:
index customer item quantity
0 0 Greg Apples 5.0
1 1 Greg Soup NaN
2 2 Susan Pears 2.0
3 3 Susan Apples 3.0
These two operations result in invalid SQL
cnts = by(sqllitedf.customer,cnt=sqllitedf.quantity.count())
res = cnts[cnts.cnt > 1]
Here is the resulting SQL. Note that the two operations weren't correctly chained together. Instead, the COUNT function was used inappropriately.
print(compute(res))
SELECT df.customer, count(df.quantity) AS cnt
FROM df
WHERE count(df.quantity) > ? GROUP BY df.customer
Reproduction in Jupyter Notebook.
Upvotes: 0
Views: 46
Reputation: 399
Answering my own question here: It looks as though this is a known issue with Blaze: https://github.com/blaze/blaze/issues/1457. Also, Blaze doesn't seem to have had much development since the Fall of 2016 so its not likely to be fixed soon. I'll be avoiding Blaze unless and until it becomes more active again.
Upvotes: 0