Edamame
Edamame

Reputation: 25366

Impala: change the column type prior to perform the aggregation function for group by

I have a table, my_table:

transaction_id    |   money     |  team
--------------------------------------------
    1             |   10        |   A
    2             |   20        |   B
    3             |   null      |   A
    4             |   30        |   A
    5             |   16        |   B
    6             |   12        |   B

When I group by team, I can compute max, min through query:

select team, max(money), min(money) from my_table group by team

However, I can't do avg and sum because there is null. i.e:

select team, avg(money), sum(money) from my_table group by team

would fail.

Is there a way to change the column type prior to computing the avg and sum? i.e. I want the output to be:

team   |  avg(money)   |  sum(money)
--------------------------------------
 A     |  20           |  40
 B     |  16           |  48

Thanks!

Upvotes: 0

Views: 3447

Answers (2)

Derrick Moeller
Derrick Moeller

Reputation: 4950

Per documentation provided by Cloudera your query should be working as-is. Both AVG Function and SUM Function ignore null.

SELECT team, AVG(money), SUM(money)
FROM my_table
GROUP BY team

UPDATE: Per your comment, again I'm not familiar with Impala. Presumably standard SQL will work. Your error appears to be a datatype issue.

SELECT team, AVG(CAST(money AS INT)), SUM(CAST(money AS INT))
FROM my_table
GROUP BY team

Upvotes: 1

kjmerf
kjmerf

Reputation: 4345

Just divide the sum by the count:

SELECT team, SUM(money)/COUNT(money) AS AVG, SUM(money)
FROM team
GROUP BY team

Tested here: http://sqlfiddle.com/#!9/ba381/4

Upvotes: 0

Related Questions