user3925023
user3925023

Reputation: 687

Count by criteria over partition

I have this table on a PostGre DB:

CREATE TABLE testAAA(Id integer PRIMARY KEY, datum date, COLA text, COLB text, COLC text, COLD int);

/* Create few records in this table */
INSERT INTO testAAA VALUES(1,to_date('01/01/2018','dd/mm/yyyy'),'PLANTA','VENDORA','OPA','1');
INSERT INTO testAAA VALUES(2,to_date('01/02/2018','dd/mm/yyyy'),'PLANTA','VENDORA','OPA','1');
INSERT INTO testAAA VALUES(3,to_date('01/03/2018','dd/mm/yyyy'),'PLANTA','VENDORA','OPA','1');
INSERT INTO testAAA VALUES(4,null,'PLANTA','VENDORA','OPA','1');
INSERT INTO testAAA VALUES(5,to_date('01/10/2018','dd/mm/yyyy'),'PLANTA','VENDORA','OPA','5');
INSERT INTO testAAA VALUES(6,to_date('01/10/2018','dd/mm/yyyy'),'PLANTA','VENDORA','OPA','10');
INSERT INTO testAAA VALUES(7,to_date('01/11/2018','dd/mm/yyyy'),'PLANTA','VENDORB','OPA','50');
INSERT INTO testAAA VALUES(8,to_date('01/10/2018','dd/mm/yyyy'),'PLANTA','VENDORA','OPA','10');
INSERT INTO testAAA VALUES(9,to_date('01/11/2018','dd/mm/yyyy'),'PLANTA','VENDORA','OPB','5');
COMMIT;

I do have this query:

SELECT COLA,COLB,COLC,to_char(datum,'YYYY-MM') AS datum_c,
    SUM(COLD) FILTER (WHERE (datum >= now() - interval '6 month') AND (datum <= now())) OVER (PARTITION BY COLA, COLB, COLC) / 6.0 AS AVG_6_month,
    /* TO Aggregate at month level we need to PARTITION IT */
    SUM(COLD) FILTER (WHERE datum is not null) OVER (PARTITION BY COLA,to_char(datum,'YYYY-MM')) AS TOT_ORDERED,
    SUM(COLD) FILTER (WHERE datum is not null) OVER (PARTITION BY COLA,COLB,to_char(datum,'YYYY-MM')) AS TOT_ORDERED_PLANT,
    (SUM(COLD) FILTER (WHERE datum >= now() - interval '6 month') OVER (PARTITION BY COLA, COLB, COLC) / 6.0) / (SUM(COLD) FILTER (WHERE datum is not null) OVER (PARTITION BY COLA))*100 AS Variation,
    CASE
      WHEN (SUM(COLD) FILTER (WHERE datum >= now() - interval '6 month') OVER (PARTITION BY COLA, COLB, COLC) / 6.0) / (SUM(COLD) FILTER (WHERE datum is not null) OVER (PARTITION BY COLA))*100 >= 10 THEN 'A'
     WHEN (SUM(COLD) FILTER (WHERE datum >= now() - interval '6 month') OVER (PARTITION BY COLA, COLB, COLC) / 6.0) / (SUM(COLD) FILTER (WHERE datum is not null) OVER (PARTITION BY COLA))*100 < 10  THEN 'B'
    END AS var_type
FROM testAAA;

I'd like to have the TOT COUNT for PARTITION COLA/COLB of the Variation <10. I tried to add count in the case statement:

WHEN (SUM(COLD) FILTER (WHERE datum >= now() - interval '6 month') OVER (PARTITION BY COLA, COLB, COLC) / 6.0) / (SUM(COLD) FILTER (WHERE datum is not null) OVER (PARTITION BY COLA))*100 >= 10 THEN count((SUM(COLD) FILTER (WHERE datum >= now() - interval '6 month') OVER (PARTITION BY COLA, COLB, COLC) / 6.0) / (SUM(COLD) FILTER (WHERE datum is not null) OVER (PARTITION BY COLA))*100)

But I got this error back:

42803: aggregate function calls cannot contain window function calls.

######EDITED after "The Impaler" TIP
with
x as ( -- first, compute the window functions
  select COLA,COLB,COLC,to_char(datum,'YYYY-MM') AS datum_c,
    SUM(COLD) FILTER (WHERE (datum >= now() - interval '6 month') AND (datum <= now())) OVER (PARTITION BY COLA, COLB, COLC) / 6.0 AS AVG_6_month,
    /* TO Aggregate at month level we need to PARTITION IT */
    SUM(COLD) FILTER (WHERE datum is not null) OVER (PARTITION BY COLA,to_char(datum,'YYYY-MM')) AS TOT_ORDERED,
    SUM(COLD) FILTER (WHERE datum is not null) OVER (PARTITION BY COLA,COLB,to_char(datum,'YYYY-MM')) AS TOT_ORDERED_PLANT,
    (SUM(COLD) FILTER (WHERE datum >= now() - interval '6 month') OVER (PARTITION BY COLA, COLB, COLC) / 6.0) / (SUM(COLD) FILTER (WHERE datum is not null) OVER (PARTITION BY COLA))*100 AS Variation
  from testAAA
)
select*,
  COUNT(Variation) FILTER (WHERE Variation < 10) OVER (PARTITION BY COLA, COLB) AS below10,
  COUNT(Variation) FILTER (WHERE Variation > 10) OVER (PARTITION BY COLA, COLB) AS above10
from x

Upvotes: 0

Views: 1081

Answers (1)

The Impaler
The Impaler

Reputation: 48770

This is not an answer, but a comment that would not fit in the comments section. When you have multiple window functions, and you want to operate their results (as in this case) I tend to compute them in a CTE (Common Table Expression); then, I use them as simple scalar values in the main query. That makes my life so much easier.

For example, a query like this:

select 
  sum(a) over(partition by region_id order by created_at) /
  sum(b) over(partition by region_id order by deleted_at) as score
from my_table

I would rephrase the query to a) pre-compute the "complex" values and then 2) use them in a simple query, as in:

with
x as ( -- first, compute the window functions
  select 
    sum(a) over(partition by region_id order by created_at) as sum_a
    sum(b) over(partition by region_id order by deleted_at) as sum_b
  from my_table
)
select
  sum_a,
  sum_b,
  sum_a / sum_b, -- then use them for any purposes
  sum_a * sum_b
from x

Maybe you can use this strategy to make your query look simpler so debugging it will be easier.

Upvotes: 3

Related Questions