Xivilai
Xivilai

Reputation: 2521

Using the HAVING Clause with GROUP by to Return Unique Records

Good day all,

I am having difficulty understanding the mechanics of the GROUP BY AND HAVING clause and was hoping for some advice.

I am trying to query two tables - PRODUCTS and ORDER_ITEMS. The PRODUCT_ID column is used to link these two tables.

I wish to view products which have been ordered from a certain supplier (filtered using the SUPPLIER_ID column which is in ORDER_ITEMS); have been successfully ordered before (ORDER_STATUS 6 in ORDER_ITEMS);and which have not been deleted (RECORD_DELETED column in ORDER_ITEMS). I only use the PRODUCTS table to show the name of the product. Furthermore I only want distinct products returned, meaning I want to exclude any results which duplicate the PRODUCT_ID column

This is the query that I am using:

SELECT 
       PD.PRODUCT_ID,
       PD.PRODUCT_NAME,
       PD.BARCODE,
       PD.SUPPLIER_BARCODE,
       COUNT(PD.PRODUCT_ID) AS COUNTED,
       ODI.ORDER_ITEM_ID
FROM PRODUCTS PD
INNER JOIN ORDER_ITEMS ODI
ON PD.PRODUCT_ID = ODI.PRODUCT_ID
WHERE ODI.SUPPLIER_ID = 34359738399
  AND ORDER_STATUS = 6
  AND ODI.RECORD_DELETED = 0
GROUP BY PD.PRODUCT_ID,PD.PRODUCT_NAME,PD.BARCODE,PD.SUPPLIER_BARCODE,ODI.ORDER_ITEM_ID
HAVING COUNT(ODI.PRODUCT_ID) = 1
ORDER BY PRODUCT_ID ASC

Unfortunately this is returning 502 records with many of them duplicating the PRODUCT_ID. If I remove the ORDER_ITEM_ID column from the query 175 records are returned. These 175 records are products that meet the criteria given above. The problem is that I also need to pull the ORDER_ITEM_ID from ORDER_ITEMS (along with some other columns).

I vaguely understand that when I include ORDER_ITEMS the query is going to group the data by the ORDER_ITEM column and so will count the PRODUCT_ID values based on each individual ORDER_ITEM_ID. This results in there always being a count of 1 for each product.

How does one get around this? Also, is there a more suitable way of carrying out this task which would allow me to include one ORDER_ITEM record for every duplicated product? Rather than omitting them altogether as I am doing above?

This is some of the data that is returned by the query above:

PRODUCT_ID,PRODUCT_NAME,BARCODE,SUPPLIER_BARCODE,COUNTED,ORDER_ITEM_ID

34359738628,ADCORTYL INTRA-ARTIC/DERMAL 10MG/ML 5ML,5099627022132,5012712000037,1,34359755708
34359739609,ARTELAC 3.2MG/ML EYE DROPS SOLN,5099627456722,5027519008933,1,34359741719
34359739626,ASACOLON 500MG SUPPOSITORIES,5099627516587,5015313012737,1,34359742783
34359739767,ATROVENT 250MCG/1ML UDV NEB SOLN,5099627639637,5012816012561,1,34359738421
34359739770,ATROVENT 500MCG/2ML UDV NEB SOLN,5099627460293,5012816012592,1,34359743524
34359739893,AZOPT 10MG/ML EYE DROPS SUSP,5099627831543,5015664002753,1,34359749091
34359739893,AZOPT 10MG/ML EYE DROPS SUSP,5099627831543,5015664002753,1,34359749687
34359739893,AZOPT 10MG/ML EYE DROPS SUSP,5099627831543,5015664002753,1,34359749715
34359739893,AZOPT 10MG/ML EYE DROPS SUSP,5099627831543,5015664002753,1,34359754053
34359740053,BACTIGRAS MED DRSS 10CMX10CM STERILE GMS,5099627672368,5000223421984,1,34359748101
34359740062,BACTROBAN 2% OINTMENT,5099627053914,5099211003165,1,34359755226
34359740558,BETNOVATE RD CREAM,5099627005692,5099211001642,1,34359752422
34359740558,BETNOVATE RD CREAM,5099627005692,5099211001642,1,34359738487
34359741045,BISODOL ANTACID TABS,5099627057707,5014398001438,1,34359750542
34359741995,BROLENE 0.1% EYE DROPS SOLN,5099627006323,50982790,1,34359746555
34359741995,BROLENE 0.1% EYE DROPS SOLN,5099627006323,50982790,1,34359751650
34359741995,BROLENE 0.1% EYE DROPS SOLN,5099627006323,50982790,1,34359751783
34359742132,BURINEX 1MG TABS,5099627551328,5702191004212,1,34359749705
34359742152,BUSCOPAN 20MG/ML SOLN FOR INJ,5099627006620,5012816018532,1,34359749083

In the example above, several records were returned with duplicate PRODUCT_ID values e.g ASACOLON 500MG SUPPOSITORIES

Upvotes: 1

Views: 1546

Answers (1)

Lukasz Szozda
Lukasz Szozda

Reputation: 175766

You need GROUP_CONCAT/LISTAGG equivalent in SQL Server. You can use XML, STUFF and correlated subquery as replacement.

If PRODUCT_ID is UNIQUE you can use:

WITH cte AS
(
  SELECT 
       PD.PRODUCT_ID,
       PD.PRODUCT_NAME,
       PD.BARCODE,
       PD.SUPPLIER_BARCODE,
       ODI.ORDER_ITEM_ID
  FROM PRODUCTS PD
  JOIN ORDER_ITEMS ODI
    ON PD.PRODUCT_ID = ODI.PRODUCT_ID
  WHERE ODI.SUPPLIER_ID = 34359738399
    AND ORDER_STATUS = 6
    AND ODI.RECORD_DELETED = 0
)
SELECT PRODUCT_ID,
       PRODUCT_NAME,
       BARCODE,
       SUPPLIER_BARCODE,
       [COUNTED] = COUNT(PD.PRODUCT_ID),
       [ORDER_ITEM_ID] = STUFF((SELECT CONCAT(',' , ORDER_ITEM_ID)
                                FROM cte c2
                                WHERE c2.PRODUCT_ID = c1.PRODUCT_ID
                                ORDER BY c2.ORDER_ITEM_ID
                                FOR XML PATH ('')), 1, 1, '')
FROM cte c1
GROUP BY PRODUCT_ID,PRODUCT_NAME,BARCODE,SUPPLIER_BARCODE
HAVING COUNT(PRODUCT_ID) = 1
ORDER BY PRODUCT_ID ASC;

LiveDemo_SimplifiedVersion

Otherwise correlate using multiple columns:

SELECT CONCAT(',' , ORDER_ITEM_ID)
FROM cte c2
WHERE c2.PRODUCT_ID = c1.PRODUCT_ID
  AND c2.PRODUCT_NAME = c1.PRODUCT_NAME
  AND ...
ORDER BY c2.ORDER_ITEM_ID
FOR XML PATH ('')), 1, 1, '')

Upvotes: 2

Related Questions