Reputation: 67

How to include SUM not as column but as separate line in the result set

The query is as follows:

  SELECT  tc.CATEGORY_NAME,
  COUNT(dc.CONSULT_ID) AS "TEST_COUNT"
  FROM DOCTOR_CONSULT dc 
  INNER JOIN CONSULT_LABTEST cl 
  on(dc.CONSULT_ID=cl.CONSULT_ID)
  INNER JOIN TEST_SETUP ts
  on(cl.LABTEST_ID=ts.TEST_ID)
  INNER JOIN TEST_CATEGORY tc
  ON (ts.CATEGORY_ID=tc.CATEGORY_ID)
 INNER JOIN OFFICE_DETAILS od 
 on(dc.OFFICE_ID=od.OFFICE_ID)  
 WHERE dc.CONSULT_DATE BETWEEN CURRENT_DATE AND CURRENT_DATE
 AND ts.ACTIVE_STATUS ='Y'
 AND tc.ACTIVE_STATUS ='Y'
 AND od.ACTIVE_STATUS ='Y'
 AND ts.LAB_TYPE IN('L')
 GROUP BY tc.CATEGORY_NAME;

The result is as follows:

CATEGORY_NAME   TEST_COUNT
Biochemistry        7
Hematology          6
Hormones            1
Clinical Pathology  1

What i require

CATEGORY_NAME   TEST_COUNT
Biochemistry        7
Hematology          6
Hormones            1
Clinical Pathology  1
SUM                 15

what is the modification required for the above SQL code

Upvotes: 0

Answers (4)

Lars Br.

Reputation: 10396

The previous answers point to a working solution approach: running two queries with different aggregation levels and UNION the result sets.

Sadly, this approach is the worst possible approach as it doubles the work for the developer (the same code for the joins needs to be exactly repeated for the second aggregation level). It also doubles the work for the database system as the query optimiser does not understand that the SUM() of the previous aggregation can simply be computed by adding up the already computed results.

This can be improved by using slightly less basic SQL features.

1. improvement approach.

At the very least one should try to avoid having to type the same join and filter conditions more than once. If this ever has to be changed in the future trying to change it consistently in all places will likely be painful. Therefore, try to reuse those definitions, e.g. by declaring views, or using the WITH clause:

CREATE  COLUMN TABLE doctor_consult 
    ( consult_id int PRIMARY KEY NOT null
    , doctor_name nvarchar(200) NOT NULL);


CREATE COLUMN TABLE labtests 
    ( lab_test_id int NOT NULL 
    , consult_id int NOT NULL 
    , test_name nvarchar(200) NOT NULL
    , test_result int NOT NULL 
    , PRIMARY KEY (lab_test_id, consult_id));

INSERT INTO doctor_consult VALUES (1, 'Dr. Mayer');
INSERT INTO doctor_consult VALUES (2, 'Dr. Green');
INSERT INTO doctor_consult VALUES (3, 'Dr. Blue');
INSERT INTO doctor_consult VALUES (4, 'Dr. Chase');


INSERT INTO labtests VALUES (1, 1, 'Colour', 12);
INSERT INTO labtests VALUES (2, 1, 'Smell', 45);
INSERT INTO labtests VALUES (3, 1, 'Roundness', 78);
INSERT INTO labtests VALUES (4, 1, 'Shearfactor', 34);

INSERT INTO labtests VALUES (5, 2, 'Colour', 67);
INSERT INTO labtests VALUES (6, 2, 'Roundness', 9);
INSERT INTO labtests VALUES (7, 3, 'Colour', 76);
INSERT INTO labtests VALUES (8, 3, 'Roundness', 43);
INSERT INTO labtests VALUES (9, 4, 'Roundness', 12);
INSERT INTO labtests VALUES (10, 4, 'Smell', 54);
INSERT INTO labtests VALUES (11, 4, 'Roundness', 24);

SELECT 
    dc.* , lt.*
FROM 
    doctor_consult dc
    INNER JOIN labtests lt 
    ON dc.consult_id = lt.consult_id;

--CONSULT_ID|DOCTOR_NAME|LAB_TEST_ID|CONSULT_ID|TEST_NAME  |TEST_RESULT|
------------+-----------+-----------+----------+-----------+-----------+
--         1|Dr. Mayer  |          1|         1|Colour     |         12|
--         1|Dr. Mayer  |          2|         1|Smell      |         45|
--         1|Dr. Mayer  |          3|         1|Roundness  |         78|
--         1|Dr. Mayer  |          4|         1|Shearfactor|         34|
--         2|Dr. Green  |          5|         2|Colour     |         67|
--         2|Dr. Green  |          6|         2|Roundness  |          9|
--         3|Dr. Blue   |          7|         3|Colour     |         76|
--         3|Dr. Blue   |          8|         3|Roundness  |         43|
--         4|Dr. Chase  |          9|         4|Roundness  |         12|
--         4|Dr. Chase  |         10|         4|Smell      |         54|
--         4|Dr. Chase  |         11|         4|Roundness  |         24|

Let's aggregate this demo data:


SELECT 
    lt.test_name, sum(lt.test_result)
FROM 
    doctor_consult dc
    INNER JOIN labtests lt 
    ON dc.consult_id = lt.consult_id
GROUP BY lt.test_name;

--
--TEST_NAME  |SUM(TEST_RESULT)|
-------------+----------------+
--Shearfactor|              34|
--Colour     |             155|
--Roundness  |             166|
--Smell      |              99|

Now, let's add the "SUM" column, but without repeating the join definition:

WITH base_agg AS (
SELECT 
    lt.test_name, sum(lt.test_result) AS result_agg
FROM 
    doctor_consult dc
    INNER JOIN labtests lt 
    ON dc.consult_id = lt.consult_id
GROUP BY lt.test_name)

SELECT test_name, result_agg  FROM base_agg
UNION ALL 
SELECT 'SUM', SUM(result_agg) AS result_agg FROM base_agg;

--
--TEST_NAME  |RESULT_AGG|
-------------+----------+
--Shearfactor|        34|
--Colour     |       155|
--Roundness  |       166|
--Smell      |        99|
--SUM        |       454|

That was easy, wasn't it? Also, using UNION ALL is more efficient than UNION in this case, because we know for sure that the groups will be unique.

This approach, however, still leaves the database engine to do more work than necessary. At the very least, the result set of the base aggregation will likely be materialised just to compute SUM() for all groups.

2. improvement approach.

The SQL standard has a standard feature for this very common requirement: GROUP BY ROLLUP. With this feature, developers can tell the database engine that the aggregation should be done on multiple levels.

SELECT 
    IFNULL(lt.test_name, 'SUM') AS TEST_NAME
    , sum(lt.test_result) AS result_agg
FROM 
    doctor_consult dc
    INNER JOIN labtests lt 
    ON dc.consult_id = lt.consult_id
GROUP BY ROLLUP  (lt.test_name);

--TEST_NAME  |RESULT_AGG|
-------------+----------+
--Colour     |       155|
--Smell      |        99|
--Roundness  |       166|
--Shearfactor|        34|
--SUM        |       454|

The IFNULL() is used to put a custom text into the computed total row - it would be NULL without that.

Note, this solution is not HANA specific - this is SQL standard functionality and most SQL DBMS nowadays support this.

3. Remarks

The second approach should be the more efficient one - after all HANA gets more specific information about what should be computed. This should allow for a more efficient plan for the query execution. But if one checks the actual execution plans generated for this toy example on a HANA 2 SP5 system, the reality of product design decisions shows up:

Features like WITH-clause and UNION (ALL) are used very often by many developers and programs. Just look at the previous answers to this question.

More specific features like GROUP BY ROLLUP are used a lot less often. For the developers of the HANA execution engines, this means: these features are waaaaay down the priority list to implement and optimise.

The execution plan for the WITH + UNION approach gets executed in the HEX engine - in about 1.06 ms, using 1.2MB of RAM.

The GROUP BY ROLLUP approach is executed by the much older JOIN ENGINE and requires the materialisation of the result set and conversion into a ROW STORE format. This requires ~ 4.11 ms and 2.8MB of RAM.

As we see: "worse is better" in this case. Once again, we have a good example of why it is so important to check the performance of SQL statements early on.

Upvotes: 3

chandan singh

Reputation: 328

To include the sum of all the test counts at the bottom of the result set, you can use the UNION operator to combine the original query with a subquery that retrieves the sum of the test counts.

SELECT  tc.CATEGORY_NAME,
    COUNT(dc.CONSULT_ID) AS "TEST_COUNT"
    FROM DOCTOR_CONSULT dc 
    INNER JOIN CONSULT_LABTEST cl 
    on(dc.CONSULT_ID=cl.CONSULT_ID)
    INNER JOIN TEST_SETUP ts
    on(cl.LABTEST_ID=ts.TEST_ID)
    INNER JOIN TEST_CATEGORY tc
    ON (ts.CATEGORY_ID=tc.CATEGORY_ID)
    INNER JOIN OFFICE_DETAILS od 
    on(dc.OFFICE_ID=od.OFFICE_ID)  
    WHERE dc.CONSULT_DATE BETWEEN CURRENT_DATE AND CURRENT_DATE
    AND ts.ACTIVE_STATUS ='Y'
    AND tc.ACTIVE_STATUS ='Y'
    AND od.ACTIVE_STATUS ='Y'
    AND ts.LAB_TYPE IN('L')
    GROUP BY tc.CATEGORY_NAME
    UNION
    SELECT 'SUM' AS CATEGORY_NAME, SUM(TEST_COUNT) as TEST_COUNT
    FROM (
        SELECT COUNT(dc.CONSULT_ID) as TEST_COUNT
        FROM DOCTOR_CONSULT dc 
        INNER JOIN CONSULT_LABTEST cl 
        on(dc.CONSULT_ID=cl.CONSULT_ID)
        INNER JOIN TEST_SETUP ts
        on(cl.LABTEST_ID=ts.TEST_ID)
        INNER JOIN TEST_CATEGORY tc
        ON (ts.CATEGORY_ID=tc.CATEGORY_ID)
        INNER JOIN OFFICE_DETAILS od 
        on(dc.OFFICE_ID=od.OFFICE_ID)  
        WHERE dc.CONSULT_DATE BETWEEN CURRENT_DATE AND CURRENT_DATE
        AND ts.ACTIVE_STATUS ='Y'
        AND tc.ACTIVE_STATUS ='Y'
        AND od.ACTIVE_STATUS ='Y'
        AND ts.LAB_TYPE IN('L')
       GROUP BY tc.CATEGORY_NAME)

Upvotes: 0

OZ Tech

Reputation: 1

You can add a UNION without the group to get the total count you want:

SELECT  
  tc.CATEGORY_NAME,
  COUNT(dc.CONSULT_ID) AS "TEST_COUNT"
FROM DOCTOR_CONSULT dc 
INNER JOIN CONSULT_LABTEST cl 
on(dc.CONSULT_ID=cl.CONSULT_ID)
INNER JOIN TEST_SETUP ts
on(cl.LABTEST_ID=ts.TEST_ID)
INNER JOIN TEST_CATEGORY tc
ON (ts.CATEGORY_ID=tc.CATEGORY_ID)
INNER JOIN OFFICE_DETAILS od 
on(dc.OFFICE_ID=od.OFFICE_ID)  
WHERE dc.CONSULT_DATE BETWEEN CURRENT_DATE AND CURRENT_DATE
  AND ts.ACTIVE_STATUS ='Y'
  AND tc.ACTIVE_STATUS ='Y'
  AND od.ACTIVE_STATUS ='Y'
  AND ts.LAB_TYPE IN('L')
GROUP BY tc.CATEGORY_NAME

UNION

SELECT 
  'SUM' AS CATEGORY_NAME, 
  COUNT(*) AS TEST_COUNT
FROM DOCTOR_CONSULT dc 
INNER JOIN CONSULT_LABTEST cl 
on(dc.CONSULT_ID=cl.CONSULT_ID)
INNER JOIN TEST_SETUP ts
on(cl.LABTEST_ID=ts.TEST_ID)
INNER JOIN TEST_CATEGORY tc
ON (ts.CATEGORY_ID=tc.CATEGORY_ID)
INNER JOIN OFFICE_DETAILS od 
on(dc.OFFICE_ID=od.OFFICE_ID)  
WHERE dc.CONSULT_DATE BETWEEN CURRENT_DATE AND CURRENT_DATE
  AND ts.ACTIVE_STATUS ='Y'
  AND tc.ACTIVE_STATUS ='Y'
  AND od.ACTIVE_STATUS ='Y'
  AND ts.LAB_TYPE IN('L')

Upvotes: 0

Yuriy

Reputation: 1

union

and sum as SUM previous select

Upvotes: -1

How to include SUM not as column but as separate line in the result set

Answers (4)

1. improvement approach.

2. improvement approach.

3. Remarks

Related Questions