The_Nabzter
The_Nabzter

Reputation: 140

SQL: How to get distinct values out of string_Agg() function?

The following code

SELECT 
      DISTINCT(p.ID) AS ID
    , PIT.Code AS Code
    , year(PT.Date) AS Year
FROM fact.PreT PT
INNER JOIN dim.ProdIType PIT 
    ON PIT.ProdITypeSKey = PT.ProdITypeSKey
INNER JOIN dim.Proudct P
    ON P.ProductSKey = pt.ProductSKey
WHERE p.ID = '15'
GROUP BY p.ID, PIT.Code, PT.Year

returns the following:

Table with Codes

I have reconfigured my script to add aggregate and group the codes by id and year, however duplicates are spotted. Code and output below:

SELECT 
      DISTINCT(p.ID) AS ID
    , string_agg(PIT.Code, ',') AS Code
    , year(PT.Date) AS Year
FROM fact.PreT PT
INNER JOIN dim.ProdIType PIT 
    ON PIT.ProdITypeSKey = PT.ProdITypeSKey
INNER JOIN dim.Proudct P
    ON P.ProductSKey = pt.ProductSKey
WHERE p.ID = '15'
GROUP BY p.ID, PT.Year

Result:

Incorrect Output

Desired output - distinct and ordered code ascending:

enter image description here

Can someone explain why string_acc is duplicating codes? how should I tackle this issue?

Upvotes: 2

Views: 10234

Answers (1)

Charlieface
Charlieface

Reputation: 71544

You need to subquery it and group again. Note that DISTINCT is not a function, it acts over the whole resultset, and is the same as grouping by all column.

SELECT 
      ID
    , string_agg(Code, ',') AS Code
    , [Year]
FROM (
    SELECT
          p.ID
        , PIT.Code AS Code
        , year(PT.Date) AS Year
    FROM fact.PreT PT
    INNER JOIN dim.ProdIType PIT 
        ON PIT.ProdITypeSKey = PT.ProdITypeSKey
    INNER JOIN dim.Proudct P
        ON P.ProductSKey = pt.ProductSKey
    WHERE p.ID = '15'
    GROUP BY p.ID, year(PT.Date), PIT.Code
) p
GROUP BY p.ID, PT.Year;

Upvotes: 4

Related Questions