Henrique Branco
Henrique Branco

Reputation: 1930

Error in Hive HQL: Not yet supported place for UDAF 'MAX'

I'm receiving this error in Hive HQL:

SELECT ce.mes AS year_month, 
       ce.sigladosite AS loc, 
       MAX(ce.consumoacumuladodomes) AS energy_est, 
       SUM(ce.consumo_mes) AS energy_real,
       SUM(CASE WHEN MAX(ce.consumoacumuladodomes) < SUM(ce.consumo_mes) THEN 1 ELSE 0 END) AS recorrencia
FROM my_db.my_table AS ce
GROUP BY ce.sigladosite, ce.mes
LIMIT 50;
FAILED: SemanticException [Error 10128]: Line 5:14 Not yet supported place for UDAF 'MAX' (state=42000,code=10128)

I've also tried to use aliases, but it didn't work. How can I use the CASE WHEN with the result from an aggregated column?

+----------+-------+---------------+-------------------+--+
|year_month| loc   | energy_est    | energy_real       |
+----------+-------+---------------+-------------------+--+
| 202001   | ABC   | 50            | 223470.0          |
| 202002   | ABC   | 50            | 8595.0            |
| 202003   | ABC   | 50            | 8595.0            |
| 202004   | ABC   | 50            | 8595.0            |
| 202005   | ABC   | 50            | 8595.0            |
| 202006   | ABC   | 50            | 8595.0            |
| 202007   | ABC   | 50            | 8595.0            |
| 202008   | ABC   | 50            | 8595.0            |
| 202005   | XZY   | 30            | 2993.0            |
| 202006   | XZY   | 30            | 2993.0            |
| 202005   | 123   | 66            | 908.0             |
| 202009   | 123   | 91            | 908.0             |
| 202001   | GHI   | 50            | 4072.0            |
| 202002   | GHI   | 50            | 4072.0            |
| 202003   | GHI   | 50            | 4072.0            |
| 202004   | GHI   | 50            | 4072.0            |
| 202005   | GHI   | 50            | 4072.0            |
| 202006   | GHI   | 50            | 4072.0            |

What I want is count the recurrency of how many followed months the energy_real was higher than energy_est for each loc.

Desired output

loc - recurrencies
ABC - 8 (8 followed months, starting from the higher, that energy_real was higher than energy_est`. Count goes until condition doesn't match.)
XZY - 2
123 - 2
GHI - 6

Upvotes: 0

Views: 2062

Answers (1)

GMB
GMB

Reputation: 222462

You can't nest aggregate functions. Probably, you want two levels of aggregation:

SELECT loc, SUM(CASE WHEN energy_est <  energy_real THEN 1 ELSE 0 END) AS recorrencia
FROM (
    SELECT ce.mes AS year_month, 
       ce.sigladosite AS loc, 
       MAX(ce.consumoacumuladodomes) AS energy_est, 
       SUM(ce.consumo_mes) AS energy_real
    FROM my_db.my_table AS ce
    GROUP BY ce.sigladosite, ce.mes
) t
GROUP BY loc

Alternatively:

SELECT loc, COUNTT(*) AS recorrencia
FROM (
    SELECT ce.mes AS year_month, 
       ce.sigladosite AS loc, 
       MAX(ce.consumoacumuladodomes) AS energy_est, 
       SUM(ce.consumo_mes) AS energy_real
    FROM my_db.my_table AS ce
    GROUP BY ce.sigladosite, ce.mes
) t
WHERE energy_est <  energy_real
GROUP BY loc

Upvotes: 1

Related Questions