Reputation: 1930
I'm receiving this error in Hive HQL:
SELECT ce.mes AS year_month,
ce.sigladosite AS loc,
MAX(ce.consumoacumuladodomes) AS energy_est,
SUM(ce.consumo_mes) AS energy_real,
SUM(CASE WHEN MAX(ce.consumoacumuladodomes) < SUM(ce.consumo_mes) THEN 1 ELSE 0 END) AS recorrencia
FROM my_db.my_table AS ce
GROUP BY ce.sigladosite, ce.mes
LIMIT 50;
FAILED: SemanticException [Error 10128]: Line 5:14 Not yet supported place for UDAF 'MAX' (state=42000,code=10128)
I've also tried to use aliases, but it didn't work. How can I use the CASE WHEN
with the result from an aggregated column?
+----------+-------+---------------+-------------------+--+
|year_month| loc | energy_est | energy_real |
+----------+-------+---------------+-------------------+--+
| 202001 | ABC | 50 | 223470.0 |
| 202002 | ABC | 50 | 8595.0 |
| 202003 | ABC | 50 | 8595.0 |
| 202004 | ABC | 50 | 8595.0 |
| 202005 | ABC | 50 | 8595.0 |
| 202006 | ABC | 50 | 8595.0 |
| 202007 | ABC | 50 | 8595.0 |
| 202008 | ABC | 50 | 8595.0 |
| 202005 | XZY | 30 | 2993.0 |
| 202006 | XZY | 30 | 2993.0 |
| 202005 | 123 | 66 | 908.0 |
| 202009 | 123 | 91 | 908.0 |
| 202001 | GHI | 50 | 4072.0 |
| 202002 | GHI | 50 | 4072.0 |
| 202003 | GHI | 50 | 4072.0 |
| 202004 | GHI | 50 | 4072.0 |
| 202005 | GHI | 50 | 4072.0 |
| 202006 | GHI | 50 | 4072.0 |
What I want is count the recurrency of how many followed months the energy_real
was higher than energy_est
for each loc
.
Desired output
loc - recurrencies
ABC - 8 (8 followed months, starting from the higher, that energy_real was higher than energy_est`. Count goes until condition doesn't match.)
XZY - 2
123 - 2
GHI - 6
Upvotes: 0
Views: 2062
Reputation: 222462
You can't nest aggregate functions. Probably, you want two levels of aggregation:
SELECT loc, SUM(CASE WHEN energy_est < energy_real THEN 1 ELSE 0 END) AS recorrencia
FROM (
SELECT ce.mes AS year_month,
ce.sigladosite AS loc,
MAX(ce.consumoacumuladodomes) AS energy_est,
SUM(ce.consumo_mes) AS energy_real
FROM my_db.my_table AS ce
GROUP BY ce.sigladosite, ce.mes
) t
GROUP BY loc
Alternatively:
SELECT loc, COUNTT(*) AS recorrencia
FROM (
SELECT ce.mes AS year_month,
ce.sigladosite AS loc,
MAX(ce.consumoacumuladodomes) AS energy_est,
SUM(ce.consumo_mes) AS energy_real
FROM my_db.my_table AS ce
GROUP BY ce.sigladosite, ce.mes
) t
WHERE energy_est < energy_real
GROUP BY loc
Upvotes: 1