Robert Brown
Robert Brown

Reputation: 11026

SQL conditional grouping and summing

My data looks like this:

|cat |subcat |amount|
---------------------
|A   |1      |123   |
|A   |2      |456   |
|B   |1      |222   |
|B   |2      |333   |

In the first case, I need to sum by cat and subcat. Easy:

SELECT cat, subcat, sum(amount) FROM data GROUP BY cat, subcat

Next, I have a more sophisticated requirement where for certain cats, the amount should be "pushed" into a given subcat. This can be stored in another config table:

|cat |subcat|
-------------
|B   |1     |

This tells me that for all cat='B' rows, the amount should be treated as a subcat=1. Furthermore, where cat='B' AND subcat <> 1 the amount should be reported as zero. In other words, the result I need is:

|cat |subcat|amount|
|A   |1     |123   |
|A   |2     |456   |
|B   |1     |555   |
|B   |2     |0     |

I cannot update my data table. Of course I can SELECT ... INTO in a proc and fix the data there, but I'm wondering if it can be done in one hit.

I can get pretty close with:

SELECT data.cat,
    ISNULL(config.subcat, data.subcat),
    SUM(amount)
FROM data
    LEFT OUTER JOIN config ON (data.cat = config.cat)
GROUP BY data.cat, ISNULL(config.subcat, data.subcat)

...but fails my second requirement to show cat:B, subcat:2 as zero.

Is it possible?

I'm using Sybase IQ 12.5 (i.e. old T-SQL, but is has the case statement, which I suspect might be useful)

Upvotes: 4

Views: 3109

Answers (6)

walterhuang
walterhuang

Reputation: 632

I am using tsql and here is my code. It's ugly but works. Actually, I like your pretty-close approach (if you don't insist showing B2 = 0).

SELECT A.cat,
       A.subcat,
       CASE WHEN B.IsConfig = 0 THEN A.amount
            WHEN B.IsConfig = 1 AND C.cat IS NULL THEN 0
            ELSE B.amount 
       END AS amount
FROM data A
INNER JOIN 
(
    SELECT B1.cat, B1.amount, CASE WHEN C1.cat IS NULL THEN 0 ELSE 1 END AS IsConfig
    FROM
    (
        SELECT cat, SUM(amount) amount
        FROM data
        GROUP BY cat
    ) B1 LEFT OUTER JOIN config C1 ON B1.cat = C1.cat
) B ON A.cat = B.cat
LEFT OUTER JOIN config C ON A.cat = C.cat AND A.subcat = C.subcat

--- I can't comment on others so I add my question here ---

Compared my code with others using Execution Plan, my query cost is 46%. Does that mean it's more efficient? Or it just depends :)

Upvotes: 0

Andriy M
Andriy M

Reputation: 77747

This is a bit similar to your solution, but UNION is only used to build a list of categories and subcategories. The list is then joined with another derived table, which is essentially the same as the right part of your UNION. Here goes:

SELECT s.cat, s.subcat, ISNULL(SUM(d.amount), 0)
FROM (
  SELECT cat, subcat FROM data
  UNION
  SELECT cat, subcat FROM config
) s
  LEFT JOIN (
    SELECT
      d.cat,
      subcat = ISNULL(c.subcat, d.subcat),
      d.amount
    FROM data d
      LEFT JOIN config c ON d.cat = c.cat
  ) d ON s.cat = d.cat AND s.subcat = d.subcat
GROUP BY s.cat, s.subcat

Upvotes: 0

pilcrow
pilcrow

Reputation: 58741

Compute SUM(amount) for all "cat"s referenced in "config" in a derived table, then match that with your "data" table entries as appropriate:

   SELECT data.cat,
          data.subcat,
          CASE 
            WHEN dt.subcat IS NULL       -- no "config" entry for cat
              THEN data.amount
            WHEN dt.subcat = data.subcat -- "config" for cat and subcat
              THEN dt.total
            ELSE 0                       -- "config" for cat not subcat
          END AS amount
     FROM data
LEFT JOIN (  SELECT config.cat,
                    config.subcat,
                    SUM(data.amount) AS total
               FROM config
               JOIN data USING (cat)
           GROUP BY 1, 2 ) dt
           USING (cat);

+-----+--------+--------+
| cat | subcat | amount |
+-----+--------+--------+
| A   |      1 |    123 |
| A   |      2 |    456 |
| B   |      1 |    555 |
| B   |      2 |      0 |
+-----+--------+--------+
4 rows in set (0.00 sec)

Upvotes: 0

Conrad Frix
Conrad Frix

Reputation: 52675

You'll need to a join Data -> Config -> Data to translate the B2 to B1 then UNION that to a SELECT with a Case statement and then the SUM and GROUP BY is easy

SELECT
   t.CAT, 
   t.SUBCAT, 
   SUM(t.AMOUNT) AMOUNT
FROM
(
SELECT d.cat, 
       d.subcat, 
       CASE 
         WHEN c.subcat IS NULL 
               OR c.subcat = d.subcat THEN d.amount 
         ELSE 0 
       END AS amount 
FROM   data d 
       LEFT JOIN config c 
         ON d.cat = c.cat 
         
UNION ALL 

SELECT d.cat, 
       d.subcat, 
       d2.amount 
FROM   data d 
       INNER JOIN config c 
         ON ( d.cat = c.cat ) 
       INNER JOIN data d2 
         ON c.cat = d2.cat 
            AND c.subcat <> d2.subcat 
            AND c.subcat = d.subcat 
) t
GROUP BY
    cat,
    subcat
ORDER BY
    cat,
    subcat
​

You can see a working example at this data.se query.

Note I added a third "B" value to test where there more than one rolled up SubCat

Another approach that uses the WITH and ROLLUP clauses (which are supported in some versions of Sybase I don't know which)

with g as ( 
    SELECT 
   
        d.cat, 
        d.subcat,
        c.subcat config_subcat, 
        sum(amount) amount,
        GROUPING(c.subcat) subcatgroup
    FROM   data d 
    LEFT JOIN config c
    ON d.cat = c.cat 
    
   GROUP BY
      d.cat, 
      d.subcat,
      c.subcat with rollup
)

SELECT
   g.cat, 
   g.subcat,
   case when g.config_subcat is null then g.amount 
     WHEN g.subcat = g.config_subcat THEN g2.amount 
     ELSE 0 end amount
FROM g 

     LEFT JOIN g g2
     ON g.cat = g2.cat and g2.subcatgroup= 1
     and g.subcat is not null and g2.subcat is null

WHERE g.subcatgroup= 0​​

Which can be viewed at this data.se query

Upvotes: 1

Robert Brown
Robert Brown

Reputation: 11026

Here's what I came up with.

SELECT cat, subcat, sum(amount)
FROM
(
    SELECT d.cat,
        d.subcat,
        CASE WHEN c.subcat <> d.subcat THEN 0 ELSE amount END amount
    FROM data d
        LEFT OUTER JOIN config c ON (d.cat = c.cat)
    UNION    
    SELECT d.cat,
        ISNULL(c.subcat, d.subcat),
        amount
    FROM data d
        LEFT OUTER JOIN config c ON (d.cat = c.cat)
    WHERE c.subcat <> d.subcat
) AS data2
GROUP BY cat, subcat

Given that it uses a derived table with a union, and that my actual data set is much larger than the one I gave in the question, I think a SELECT ... INTO followed by an update might actually be the more performant approach!

Upvotes: 1

Jeremy Wiggins
Jeremy Wiggins

Reputation: 7299

I'm a little confused by the requirements, but I think this is what you want.

SELECT d.cat,
       d.subcat, 
       SUM(CASE 
           WHEN c.subcat IS NULL OR c.subcat = d.subcat 
           THEN d.amount 
           ELSE 0 
        END) as Amount
FROM @Data d
    LEFT OUTER JOIN @Config c ON (d.cat = c.cat)
GROUP BY d.cat, d.subcat
ORDER BY d.cat

Example here - https://data.stackexchange.com/stackoverflow/q/120507/

Let me know if that's not what you were going for.

Upvotes: 0

Related Questions