yabchexu
yabchexu

Reputation: 593

how to query the percentage of aggregate in vertica

Table product

productId type
1          A
2          A
3          A
4          B
5          B
6          C

What I want:

type     perc
A         0.5
B         0.33
C         0.17

We can write a simple query like this:

Select type, cnt/(select count(*) from product) AS perc
FROM (
select type, count(*) as cnt
from product
group by type
 ) nested

But vertica doesn't support the subselect which is not correlated

Need someone's help!

Upvotes: 1

Views: 3273

Answers (2)

kimbo305
kimbo305

Reputation: 660

You could also use analytic functions, which are messy in this application, but work:

WITH product AS (
          select 1 as productId, 'A' as type
union all select 2, 'A'
union all select 3, 'A'
union all select 4, 'B'
union all select 5, 'B'
union all select 6, 'C'
)

SELECT distinct /* distinct because analytic functions don't reduce row count like aggregate functions */
  type, count(*) over (partition by type) / count(*) over ()
FROM product;

 type |         perc         
------+----------------------
 A    | 0.500000000000000000
 B    | 0.333333333333333333
 C    | 0.166666666666666667

count(*) over (partition by type) counts each type;

count(*) over () counts over everything, so gets the total count

Upvotes: 2

mauro
mauro

Reputation: 5950

Vertica does support both correlated and non-correlated subquery even if you might have restrictions on the joining predicate.

So, your query here above just works. And - guess what - it continues to work even if you use indentation:

SQL> SELECT
         type
       , cnt/( select count (*) FROM product ) AS perc
     FROM
         ( SELECT type, count (*) as cnt
           FROM product
           GROUP BY type
         ) nested ;
 type |         perc         
------+----------------------
 C    | 0.166666666666666667
 A    | 0.500000000000000000
 B    | 0.333333333333333333
(3 rows)

Of course you can re-write it in a different way. For example:

SQL> SELECT
        a.type
      , a.cnt/b.tot as perc
    FROM
      ( SELECT type , count (*) as cnt
        FROM product
        GROUP BY type ) a
    CROSS JOIN
      ( SELECT count (*) AS tot
        FROM product ) b
    ORDER BY 1
    ;
 type |         perc         
------+----------------------
 A    | 0.500000000000000000
 B    | 0.333333333333333333
 C    | 0.166666666666666667
(3 rows)

Upvotes: 2

Related Questions