BFF
BFF

Reputation: 386

How to use conditional group by aggregations correctly

I want to be able to count the total type of apples (organic only) from each continent, broken down by countries; including the total count if they're mixed.

For example, food item B1 is organic golden apples from the USA. Thus there should be a count of "1" golden_bag and "1" for organic. Now, A1 is also organic from Argentina - however, it has both granny and red delicious apples - thus it is counted as "1" mixed_bag and "1" for granny_bag and "1" for red_bag as well.

Finally, E1 and F1 are both fuji apples from laos, but one is organic the other isn't; so total count is 2 fuji_bag and it should have a total count of 1 for organic_fd.

Table X:
food_item | food_area | food_loc   | food_exp
A1          lxgs        argentina   1/1/20
B1          iyan        usa         5/31/21
C1          lxgs        peru        4/1/20
D1          wa8e        norway      10/1/19
E1          894a        laos        5/1/19
F1          894a        laos        9/17/19


Table Y:
food_item | organic
A1          Y
B1          Y
C1          N
D1          N
E1          Y
F1          N

Table Z:
food_item | food_type
A1          189
A1          190
B1          191
C1          189
D1          192
E1          193
F1          193

SELECT continent, country,
      SUM(organic)  AS organic_fd, SUM(Granny) AS granny_bag,
      SUM(Red_delc) AS red_bag,    SUM(Golden) AS golden_bag,
      SUM(Gala)     AS gala_bag,   SUM(Fuji)   AS fuji_bag,
      SUM(CASE WHEN Granny + Red_delc + Golden + Gala + Fuji > 1 THEN 1  ELSE 0 END) AS mixed_bag     
FROM (SELECT (CASE SUBSTR (x.food_area, 4, 1)
              WHEN 's' THEN 'SA' WHEN 'n' THEN 'NA'
              WHEN 'e' THEN 'EU' WHEN 'a' THEN 'AS' ELSE NULL END) continent,
          x.food_loc country, COUNT(y.organic) AS Organic
          COUNT(CASE WHEN z.food_type = '189' THEN 1 END) AS Granny,
          COUNT(CASE WHEN z.food_type = '190' THEN 1 END) AS Red_delc,
          COUNT(CASE WHEN z.food_type = '191' THEN 1 END) AS Golden,
          COUNT(CASE WHEN z.food_type = '192' THEN 1 END) AS Gala,
          COUNT(CASE WHEN z.food_type = '193' THEN 1 END) AS Fuji      
    FROM x LEFT JOIN z ON x.food_item = z.food_item
           LEFT JOIN y on x.food_item = y.food_item and y.organic = 'Y'    
               WHERE  x.exp_date > sysdate
    GROUP BY SUBSTR (x.food_area, 4, 1), x.food_loc, y.organic) h
GROUP BY h.continent, h.country, h.organic

I'm not getting the correct output, since for example, Laos will show TWICE to account for the organic count and non-organic count. So it will show 1 organic_fd and 0 organic_fd and 1 fuji_bag and the other line will be another 1 fuji_bag. I would like the TOTAL count. (Also, if I add more food items, my mixed_bag shows mostly "1" count for each record/lines).

Below is the desired output:

| continent | country   |organic_fd | granny_bag| red_bag| golden_bag| gala_bag|fuji_bag | mixed_bag
| SA        | argentina |    1      | 1         |   1    | 0         | 0       | 0       | 1
| SA        | peru      |    0      | 1         |   0    | 0         | 0       | 0       | 0
| NA        | usa       |    1      | 0         |   0    | 1         | 0       | 0       | 0
| EU        | norway    |    0      | 0         |   0    | 0         | 1       | 0       | 0
| AS        | laos      |    1      | 0         |   0    | 0         | 0       | 2       | 0

So, say I want to add another food item, G1 from Norway and it has 3 types of organic apples: fuji, red, granny... then Norway will now have a count of 1 for the following columns: mixed_bag, organic_fd, fuji_bag, red_bag ,granny_bag (in addition to the previous count of 1 gala_bag). If you add H1, which is exactly the same as G1, then it will now have a total count of 2 for the following: mixed_bag, organic_fd, fuji_bag,red_bag, granny_bag

Upvotes: 0

Views: 89

Answers (2)

Andrei Odegov
Andrei Odegov

Reputation: 3429

The query:

WITH
  t AS (
    SELECT
      CASE SUBSTR(X.food_area, LENGTH(X.food_area), 1)
        WHEN 's' THEN 'SA'
        WHEN 'n' THEN 'NA'
        WHEN 'e' THEN 'EU'
        WHEN 'a' THEN 'AS'
        ELSE NULL
      END AS continent,
      x.food_loc AS country,
      COUNT(DISTINCT CASE Y.organic WHEN 'Y' THEN X.food_item END) OVER (
        PARTITION BY x.food_loc
      ) AS organic_fd,
      CASE
        WHEN MIN(Z.food_type) OVER (
               PARTITION BY x.food_loc, X.food_item
             ) = Z.food_type AND
             MAX(Z.food_type) OVER (
               PARTITION BY x.food_loc, X.food_item
             ) > Z.food_type THEN 1 END AS mixed,
      Z.food_type
    FROM X
    JOIN Y ON X.food_item = Y.food_item
    JOIN Z ON Y.food_item = Z.food_item
  )
SELECT
  continent, country, organic_fd,
  COUNT(CASE WHEN food_type = '189' THEN 1 END) AS Granny,
  COUNT(CASE WHEN food_type = '190' THEN 1 END) AS Red_delc,
  COUNT(CASE WHEN food_type = '191' THEN 1 END) AS Golden,
  COUNT(CASE WHEN food_type = '192' THEN 1 END) AS Gala,
  COUNT(CASE WHEN food_type = '193' THEN 1 END) AS Fuji,
  COUNT(mixed) AS mixed_bag
FROM t
GROUP BY continent, country, organic_fd

You can try this query here: https://rextester.com/TSSH87409.

Upvotes: 1

Ponder Stibbons
Ponder Stibbons

Reputation: 14848

You have one to many relationship between x and z, and join may produce many rows for each row in x, like in case of A1. So you have to number rows in x at first, this is what my subquery t1 do, except of mapping values. Then group them taking max() for each counted column (granny, organic etc.), like in subquery t2. Finally sum values.

dbfiddle demo

with
  t1 as (
    select rn, food_item, food_area, food_loc country, food_exp, food_type,
           decode(substr(food_area, 4, 1), 's', 'SA', 'n', 'NA', 'e', 'EU', 'a', 'AS') continent,
           case organic when 'Y' then 1 else 0 end org,
           case when food_type = '189' then 1 else 0 end gra,
           case when food_type = '190' then 1 else 0 end red,
           case when food_type = '191' then 1 else 0 end gol,
           case when food_type = '192' then 1 else 0 end gal,
           case when food_type = '193' then 1 else 0 end fuj 
      from (select rownum rn, x.* from x) x join y using (food_item) join z using (food_item)
      where food_exp > sysdate),
  t2 as (
    select rn, country, continent, max(org) org, max(gra) gra, 
           max(red) red, max(gol) gol, max(gal) gal, max(fuj) fuj,
           case when max(gra) + max(red) + max(gol) + max(gal) + max(fuj) > 1 
                then 1 else 0 
            end mix
       from t1 group by rn, country, continent)
select continent, country, sum(org) organic_fd, sum(gra) granny, sum(red) red_delc, 
       sum(gol) golden_bag, sum(gal) gala_bag, sum(fuj) fuji_bag, sum(mix) mixed_bag 
  from t2 
  group by continent, country

Above query gave expected output, please test it and adjust if needed. I noticed you use left joins. If there is possibility that for some rows in X there is no data in Y or Z you may have to add nvl()s in calculations. Maybe you should also put mapped, hardcoded values into tables. Hardcoding them is not good practice. Hope this helps :)

Upvotes: 0

Related Questions