Reputation: 67
The query is as follows:
SELECT tc.CATEGORY_NAME,
COUNT(dc.CONSULT_ID) AS "TEST_COUNT"
FROM DOCTOR_CONSULT dc
INNER JOIN CONSULT_LABTEST cl
on(dc.CONSULT_ID=cl.CONSULT_ID)
INNER JOIN TEST_SETUP ts
on(cl.LABTEST_ID=ts.TEST_ID)
INNER JOIN TEST_CATEGORY tc
ON (ts.CATEGORY_ID=tc.CATEGORY_ID)
INNER JOIN OFFICE_DETAILS od
on(dc.OFFICE_ID=od.OFFICE_ID)
WHERE dc.CONSULT_DATE BETWEEN CURRENT_DATE AND CURRENT_DATE
AND ts.ACTIVE_STATUS ='Y'
AND tc.ACTIVE_STATUS ='Y'
AND od.ACTIVE_STATUS ='Y'
AND ts.LAB_TYPE IN('L')
GROUP BY tc.CATEGORY_NAME;
The result is as follows:
CATEGORY_NAME TEST_COUNT
Biochemistry 7
Hematology 6
Hormones 1
Clinical Pathology 1
What i require
CATEGORY_NAME TEST_COUNT
Biochemistry 7
Hematology 6
Hormones 1
Clinical Pathology 1
SUM 15
what is the modification required for the above SQL code
Upvotes: 0
Views: 79
Reputation: 10396
The previous answers point to a working solution approach: running two queries with different aggregation levels and UNION the result sets.
Sadly, this approach is the worst possible approach as it doubles the work for the developer (the same code for the joins needs to be exactly repeated for the second aggregation level). It also doubles the work for the database system as the query optimiser does not understand that the SUM()
of the previous aggregation can simply be computed by adding up the already computed results.
This can be improved by using slightly less basic SQL features.
At the very least one should try to avoid having to type the same join and filter conditions more than once. If this ever has to be changed in the future trying to change it consistently in all places will likely be painful. Therefore, try to reuse those definitions, e.g. by declaring views, or using the WITH
clause:
CREATE COLUMN TABLE doctor_consult
( consult_id int PRIMARY KEY NOT null
, doctor_name nvarchar(200) NOT NULL);
CREATE COLUMN TABLE labtests
( lab_test_id int NOT NULL
, consult_id int NOT NULL
, test_name nvarchar(200) NOT NULL
, test_result int NOT NULL
, PRIMARY KEY (lab_test_id, consult_id));
INSERT INTO doctor_consult VALUES (1, 'Dr. Mayer');
INSERT INTO doctor_consult VALUES (2, 'Dr. Green');
INSERT INTO doctor_consult VALUES (3, 'Dr. Blue');
INSERT INTO doctor_consult VALUES (4, 'Dr. Chase');
INSERT INTO labtests VALUES (1, 1, 'Colour', 12);
INSERT INTO labtests VALUES (2, 1, 'Smell', 45);
INSERT INTO labtests VALUES (3, 1, 'Roundness', 78);
INSERT INTO labtests VALUES (4, 1, 'Shearfactor', 34);
INSERT INTO labtests VALUES (5, 2, 'Colour', 67);
INSERT INTO labtests VALUES (6, 2, 'Roundness', 9);
INSERT INTO labtests VALUES (7, 3, 'Colour', 76);
INSERT INTO labtests VALUES (8, 3, 'Roundness', 43);
INSERT INTO labtests VALUES (9, 4, 'Roundness', 12);
INSERT INTO labtests VALUES (10, 4, 'Smell', 54);
INSERT INTO labtests VALUES (11, 4, 'Roundness', 24);
SELECT
dc.* , lt.*
FROM
doctor_consult dc
INNER JOIN labtests lt
ON dc.consult_id = lt.consult_id;
--CONSULT_ID|DOCTOR_NAME|LAB_TEST_ID|CONSULT_ID|TEST_NAME |TEST_RESULT|
------------+-----------+-----------+----------+-----------+-----------+
-- 1|Dr. Mayer | 1| 1|Colour | 12|
-- 1|Dr. Mayer | 2| 1|Smell | 45|
-- 1|Dr. Mayer | 3| 1|Roundness | 78|
-- 1|Dr. Mayer | 4| 1|Shearfactor| 34|
-- 2|Dr. Green | 5| 2|Colour | 67|
-- 2|Dr. Green | 6| 2|Roundness | 9|
-- 3|Dr. Blue | 7| 3|Colour | 76|
-- 3|Dr. Blue | 8| 3|Roundness | 43|
-- 4|Dr. Chase | 9| 4|Roundness | 12|
-- 4|Dr. Chase | 10| 4|Smell | 54|
-- 4|Dr. Chase | 11| 4|Roundness | 24|
Let's aggregate this demo data:
SELECT
lt.test_name, sum(lt.test_result)
FROM
doctor_consult dc
INNER JOIN labtests lt
ON dc.consult_id = lt.consult_id
GROUP BY lt.test_name;
--
--TEST_NAME |SUM(TEST_RESULT)|
-------------+----------------+
--Shearfactor| 34|
--Colour | 155|
--Roundness | 166|
--Smell | 99|
Now, let's add the "SUM"
column, but without repeating the join definition:
WITH base_agg AS (
SELECT
lt.test_name, sum(lt.test_result) AS result_agg
FROM
doctor_consult dc
INNER JOIN labtests lt
ON dc.consult_id = lt.consult_id
GROUP BY lt.test_name)
SELECT test_name, result_agg FROM base_agg
UNION ALL
SELECT 'SUM', SUM(result_agg) AS result_agg FROM base_agg;
--
--TEST_NAME |RESULT_AGG|
-------------+----------+
--Shearfactor| 34|
--Colour | 155|
--Roundness | 166|
--Smell | 99|
--SUM | 454|
That was easy, wasn't it? Also, using UNION ALL
is more efficient than UNION
in this case, because we know for sure that the groups will be unique.
This approach, however, still leaves the database engine to do more work than necessary. At the very least, the result set of the base aggregation will likely be materialised just to compute SUM()
for all groups.
The SQL standard has a standard feature for this very common requirement: GROUP BY ROLLUP
. With this feature, developers can tell the database engine that the aggregation should be done on multiple levels.
SELECT
IFNULL(lt.test_name, 'SUM') AS TEST_NAME
, sum(lt.test_result) AS result_agg
FROM
doctor_consult dc
INNER JOIN labtests lt
ON dc.consult_id = lt.consult_id
GROUP BY ROLLUP (lt.test_name);
--TEST_NAME |RESULT_AGG|
-------------+----------+
--Colour | 155|
--Smell | 99|
--Roundness | 166|
--Shearfactor| 34|
--SUM | 454|
The IFNULL()
is used to put a custom text into the computed total row - it would be NULL
without that.
Note, this solution is not HANA specific - this is SQL standard functionality and most SQL DBMS nowadays support this.
The second approach should be the more efficient one - after all HANA gets more specific information about what should be computed. This should allow for a more efficient plan for the query execution. But if one checks the actual execution plans generated for this toy example on a HANA 2 SP5 system, the reality of product design decisions shows up:
Features like WITH
-clause and UNION (ALL)
are used very often by many developers and programs. Just look at the previous answers to this question.
More specific features like GROUP BY ROLLUP
are used a lot less often. For the developers of the HANA execution engines, this means: these features are waaaaay down the priority list to implement and optimise.
The execution plan for the WITH + UNION approach gets executed in the HEX engine - in about 1.06 ms, using 1.2MB of RAM.
The GROUP BY ROLLUP approach is executed by the much older JOIN ENGINE and requires the materialisation of the result set and conversion into a ROW STORE format. This requires ~ 4.11 ms and 2.8MB of RAM.
As we see: "worse is better" in this case. Once again, we have a good example of why it is so important to check the performance of SQL statements early on.
Upvotes: 3
Reputation: 328
To include the sum of all the test counts at the bottom of the result set, you can use the UNION operator to combine the original query with a subquery that retrieves the sum of the test counts.
SELECT tc.CATEGORY_NAME,
COUNT(dc.CONSULT_ID) AS "TEST_COUNT"
FROM DOCTOR_CONSULT dc
INNER JOIN CONSULT_LABTEST cl
on(dc.CONSULT_ID=cl.CONSULT_ID)
INNER JOIN TEST_SETUP ts
on(cl.LABTEST_ID=ts.TEST_ID)
INNER JOIN TEST_CATEGORY tc
ON (ts.CATEGORY_ID=tc.CATEGORY_ID)
INNER JOIN OFFICE_DETAILS od
on(dc.OFFICE_ID=od.OFFICE_ID)
WHERE dc.CONSULT_DATE BETWEEN CURRENT_DATE AND CURRENT_DATE
AND ts.ACTIVE_STATUS ='Y'
AND tc.ACTIVE_STATUS ='Y'
AND od.ACTIVE_STATUS ='Y'
AND ts.LAB_TYPE IN('L')
GROUP BY tc.CATEGORY_NAME
UNION
SELECT 'SUM' AS CATEGORY_NAME, SUM(TEST_COUNT) as TEST_COUNT
FROM (
SELECT COUNT(dc.CONSULT_ID) as TEST_COUNT
FROM DOCTOR_CONSULT dc
INNER JOIN CONSULT_LABTEST cl
on(dc.CONSULT_ID=cl.CONSULT_ID)
INNER JOIN TEST_SETUP ts
on(cl.LABTEST_ID=ts.TEST_ID)
INNER JOIN TEST_CATEGORY tc
ON (ts.CATEGORY_ID=tc.CATEGORY_ID)
INNER JOIN OFFICE_DETAILS od
on(dc.OFFICE_ID=od.OFFICE_ID)
WHERE dc.CONSULT_DATE BETWEEN CURRENT_DATE AND CURRENT_DATE
AND ts.ACTIVE_STATUS ='Y'
AND tc.ACTIVE_STATUS ='Y'
AND od.ACTIVE_STATUS ='Y'
AND ts.LAB_TYPE IN('L')
GROUP BY tc.CATEGORY_NAME)
Upvotes: 0
Reputation: 1
You can add a UNION without the group to get the total count you want:
SELECT
tc.CATEGORY_NAME,
COUNT(dc.CONSULT_ID) AS "TEST_COUNT"
FROM DOCTOR_CONSULT dc
INNER JOIN CONSULT_LABTEST cl
on(dc.CONSULT_ID=cl.CONSULT_ID)
INNER JOIN TEST_SETUP ts
on(cl.LABTEST_ID=ts.TEST_ID)
INNER JOIN TEST_CATEGORY tc
ON (ts.CATEGORY_ID=tc.CATEGORY_ID)
INNER JOIN OFFICE_DETAILS od
on(dc.OFFICE_ID=od.OFFICE_ID)
WHERE dc.CONSULT_DATE BETWEEN CURRENT_DATE AND CURRENT_DATE
AND ts.ACTIVE_STATUS ='Y'
AND tc.ACTIVE_STATUS ='Y'
AND od.ACTIVE_STATUS ='Y'
AND ts.LAB_TYPE IN('L')
GROUP BY tc.CATEGORY_NAME
UNION
SELECT
'SUM' AS CATEGORY_NAME,
COUNT(*) AS TEST_COUNT
FROM DOCTOR_CONSULT dc
INNER JOIN CONSULT_LABTEST cl
on(dc.CONSULT_ID=cl.CONSULT_ID)
INNER JOIN TEST_SETUP ts
on(cl.LABTEST_ID=ts.TEST_ID)
INNER JOIN TEST_CATEGORY tc
ON (ts.CATEGORY_ID=tc.CATEGORY_ID)
INNER JOIN OFFICE_DETAILS od
on(dc.OFFICE_ID=od.OFFICE_ID)
WHERE dc.CONSULT_DATE BETWEEN CURRENT_DATE AND CURRENT_DATE
AND ts.ACTIVE_STATUS ='Y'
AND tc.ACTIVE_STATUS ='Y'
AND od.ACTIVE_STATUS ='Y'
AND ts.LAB_TYPE IN('L')
Upvotes: 0