Derek Foulk
Derek Foulk

Reputation: 1902

Merge Multiple Rows Based On Unique Identifier in SQL Server (GROUP_CONCAT for SQL Server)?

I was initially trying to figure this out via PHP, however I have not had much luck...

How Can I Merge All Duplicates In Array Based On One Key's Value?

Since I have not found a solution, I decided to try to solve my issue via my SQL query. What I need to know is how can I "merge" the differences between the returned rows in this query?

SELECT
  Item.ID,
  Item.ItemLookupCode,
  nitroasl_pamtable.ManufacturerPartNumber,
  SupplierList.ReorderNumber,
  Item.Notes,
  Item.Description,
  Item.ExtendedDescription,
  Item.Quantity,
  nitroasl_pamtable.SpoofStock,
  Item.Price,
  nitroasl_pamtable.PAM_Keywords
FROM 
   Item
JOIN 
   nitroasl_pamtable ON Item.ID = nitroasl_pamtable.ItemID
JOIN 
   SupplierList ON Item.ID = SupplierList.ItemID
WHERE 
   (Item.ItemLookupCode LIKE '%tp-ac1750%' AND Price > 0.00 AND WebItem = 1)
   OR 
   (nitroasl_pamtable.ManufacturerPartNumber LIKE '%tp-ac1750%' 
    AND Price > 0.00 AND WebItem = 1)
   OR 
   (SupplierList.ReorderNumber LIKE '%tp-ac1750%' AND Price > 0.00 
    AND WebItem = 1)
   OR 
   (Item.Notes LIKE '%tp-ac1750%' AND Price > 0.00 AND WebItem = 1)
   OR 
   (Item.Description LIKE '%tp-ac1750%' AND Price > 0.00 AND WebItem = 1)
   OR 
   (Item.ExtendedDescription LIKE '%tp-ac1750%' AND Price > 0.00 
    AND WebItem = 1)
   OR 
   (nitroasl_pamtable.PAM_Keywords LIKE '%tp-ac1750%' AND Price > 0.00 
    AND WebItem = 1)
ORDER BY 
    Item.ItemLookupCode ASC;

What I think I need (but haven't successfully implemented)

MySQL's GROUP_CONCAT equivalent

I believe this function would do what I need, but I am using SQL Server - not MySQL. And I cannot seem to get the posted solutions on how to do this to work for me...

What I've tried:

Recently, I tried the MAX() and the GROUP BY function (together), but it picks the MAX value that is returned in the duplicate rows, thus returning a single row with the MAX values in each column.

SELECT
    MAX(Item.ID) AS Id,
    Item.ItemLookupCode,
    MAX(nitroasl_pamtable.ManufacturerPartNumber) AS ManufacturerPartNumber,
    MAX(SupplierList.ReorderNumber) AS ReorderNumber,
    MAX( CAST(Item.Notes AS varchar(max)) ) AS Notes,
    MAX(Item.Description) AS Description,
    MAX( CAST(Item.ExtendedDescription AS varchar(max)) ) AS ExtendedDescription,
    MAX(Item.Quantity) AS Quantity,
    MAX(nitroasl_pamtable.SpoofStock) AS SpoofStock,
    MAX(Item.Price) AS Price,
    MAX(nitroasl_pamtable.PAM_Keywords) AS PAM_Keywords,
    MAX(Item.PictureName) AS PictureName
FROM 
    Item
LEFT JOIN 
    nitroasl_pamtable ON Item.ID = nitroasl_pamtable.ItemID
LEFT JOIN 
    SupplierList ON Item.ID = SupplierList.ItemID
WHERE 
    (Item.ItemLookupCode LIKE '%tp-ac1750%' AND Price > 0.00 AND WebItem = 1)
    OR (nitroasl_pamtable.ManufacturerPartNumber LIKE '%tp-ac1750%' AND Price > 0.00 AND WebItem = 1)
    OR (SupplierList.ReorderNumber LIKE '%tp-ac1750%' AND Price > 0.00 AND WebItem = 1)
    OR (Item.Notes LIKE '%tp-ac1750%' AND Price > 0.00 AND WebItem = 1)
    OR (Item.Description LIKE '%tp-ac1750%' AND Price > 0.00 AND WebItem = 1)
    OR (Item.ExtendedDescription LIKE '%tp-ac1750%' AND Price > 0.00 AND WebItem = 1)
    OR (nitroasl_pamtable.PAM_Keywords LIKE '%tp-ac1750%' AND Price > 0.00 AND WebItem = 1)
GROUP BY 
    Item.ItemLookupCode
ORDER BY 
    Item.ItemLookupCode ASC

Instead of discarding the variants of each column, I would like put all the returned values of each column (that are discarded with MAX) into their respective/original columns separated by a comma...

What I need:

Database Schema (Sample)

In the file above, you will see the four rows returned by the above SQL query. I would like to have one row returned that looks like this:

ID:

8265

ItemLookupCode:

TP-AC1750

ManufacturerPartNumber:

Archer C7

ReorderNumber:

7681617, ARCHERC7, N82E16833704177

Notes:

TP-LINK Archer C7 AC1750 Routr

Description:

TP-LINK Archer C7 AC1750 Routr

ExtendedDescription:

TP-Link Archer C7 Wireless-AC1750 Dual-Band Gigabit Router

Quantity:

0 (This should actually be a combined sum/total of the values in this column)

SpoofStock:

NULL (Same as Quantity - Should be sum / This value is different than Quantity)

Price:

129.95

PAM_Keywords:

NULL

I know there is a better way to write this query. I am just not an SQL guy. This query/script is a keyword search that returns items in our Microsoft Dynamics RMS database, and outputs JSON that I use to create a list of products that can be changed and resubmitted to the DB. I use SQL Server 2008 R2 (if that matters). Any advice on how I can accomplish the above output using some variation of my query would be greatly appreciated! Thanks

Update (SQLFiddle)

Here is a link to an SQLFiddle to play around with :)

SQLFiddle with No MAX Function

SQLFiddle with MAX Function (Not a viable solution as I lose data)

Upvotes: 2

Views: 1672

Answers (2)

ErikE
ErikE

Reputation: 50271

This will get you started, but there is still some uncertainty around the nitroasl_pamtable table, so I didn't include that.

SELECT
  I.ID,
  I.ItemLookupCode,
  I.Notes,
  I.Description,
  I.ExtendedDescription,
  I.Quantity,
  I.Price,
  SL.ReorderNumbers,
  P.SpoofStock,
  P.ManufacturerPartNumber,
  P.PAM_Keywords
FROM
  Item I
  LEFT JOIN nitroasl_pamtable P
    ON I.ID = P.ItemID
  OUTER APPLY (
    SELECT
      ReorderNumbers = Substring((
        SELECT DISTINCT Convert(varchar(max), ', ' + SL.ReorderNumber)
        FROM SupplierList SL
        WHERE I.ID = SL.ItemID
        FOR XML PATH(''), TYPE
      ).value('.[1]', 'varchar(max)'), 3, 2147483647)
  ) SL
WHERE
  I.Price > 0.00
  AND I.WebItem = 1
  AND (
    I.ItemLookupCode LIKE '%tp-ac1750%'
    OR I.Notes LIKE '%tp-ac1750%'
    OR I.Description LIKE '%tp-ac1750%'
    OR I.ExtendedDescription LIKE '%tp-ac1750%'
    OR P.ManufacturerPartNumber LIKE '%tp-ac1750%'
    OR P.PAM_Keywords LIKE '%tp-ac1750%'
    OR EXISTS (
      SELECT *
      FROM dbo.SupplierList SL2
      WHERE
        I.ID = SL2.ItemID
        AND SL2.ReorderNumber LIKE '%tp-ac1750%'
    )
  )
ORDER BY
  I.ItemLookupCode ASC;

To bring in nitroasl_pamtable correctly, for every column you want to concatenate, you could do a new OUTER APPLY. You can do a single OUTER APPLY to get at once all columns that need normal aggregation (such as Sum()).

However, I would like to offer that this concatenation will obscure the data in a way that could lead to incorrect assessment or decisions. Pulling in 3 values from a table and concatenating/summing them will make them appear to be a single unit, which may not be correct.

Another way that the concatenating may be harmful is with the reorder numbers. Notice that I put a DISTINCT in there because of the two duplicate reorder numbers--but they were from different suppliers. So what good is a reorder number apart from the supplier it can be sourced from? What if two different items have the same reorder number at different suppliers? (E.g., reorder number BIGBOX is a TV at one supplier, but it's a giant cardboard box at the other one.)

I am not convinced that it is a good idea to concatenate these values in the query. Instead, the UI should be presented the queries separately (the Items as one rowset, then the supporting data from each other table as individual rowsets) and then present the data in a way that makes sense in the UI.

Upvotes: 1

Solomon Rutzky
Solomon Rutzky

Reputation: 48874

If I understand this correctly, it looks like you just need to concatenate the ReorderNumber field. You could use a SQLCLR User-Defined Aggregate (UDA) to do this rather simply.

There is a pre-done UDA to do this called Agg_Join in the SQL# library of SQLCLR functions, stored procedures, etc (that I am the author of, but this aggregate function is available in the Free version). Using it would make your query look as follows:

SELECT
    MAX(Item.ID) AS Id,
    Item.ItemLookupCode,
    MAX(nitroasl_pamtable.ManufacturerPartNumber) AS ManufacturerPartNumber,
    SQL#.Agg_Join(SupplierList.ReorderNumber) AS ReorderNumber,
    MAX( CAST(Item.Notes AS varchar(max)) ) AS Notes,
    MAX(Item.Description) AS Description,
    MAX( CAST(Item.ExtendedDescription AS varchar(max)) ) AS ExtendedDescription,
    MAX(Item.Quantity) AS Quantity,
    MAX(nitroasl_pamtable.SpoofStock) AS SpoofStock,
    MAX(Item.Price) AS Price,
    MAX(nitroasl_pamtable.PAM_Keywords) AS PAM_Keywords,
    MAX(Item.PictureName) AS PictureName
FROM   Item
LEFT JOIN   nitroasl_pamtable
       ON Item.ID = nitroasl_pamtable.ItemID
LEFT JOIN  SupplierList
       ON Item.ID = SupplierList.ItemID
WHERE 
    (Item.ItemLookupCode LIKE '%tp-ac1750%' AND Price > 0.00 AND WebItem = 1)
    OR (nitroasl_pamtable.ManufacturerPartNumber LIKE '%tp-ac1750%'
         AND Price > 0.00
         AND WebItem = 1)
    OR (SupplierList.ReorderNumber LIKE '%tp-ac1750%'
          AND Price > 0.00
          AND WebItem = 1)
    OR (Item.Notes LIKE '%tp-ac1750%' AND Price > 0.00 AND WebItem = 1)
    OR (Item.Description LIKE '%tp-ac1750%' AND Price > 0.00 AND WebItem = 1)
    OR (Item.ExtendedDescription LIKE '%tp-ac1750%' AND Price > 0.00 AND WebItem = 1)
    OR (nitroasl_pamtable.PAM_Keywords LIKE '%tp-ac1750%' AND Price > 0.00 AND WebItem = 1)
GROUP BY      Item.ItemLookupCode
ORDER BY      Item.ItemLookupCode ASC;

Just to mention, the Full version of SQL# includes a more functional version of Agg_Join called Agg_JoinPlus that allows for sorting, filtering out duplicates, replacing NULLs, changing the delimiter, etc.

Or, if you want to create this on your own, in which case you can customize the functionality, I wrote an article showing an example of creating a User-Defined Aggregate that would just need to be altered a little to do string concatenation: Getting The Most Out of SQL Server 2005 UDTs and UDAs (free registration is required). That was written prior to SQL Server 2008 coming out that had the ability to set MaxSize to -1 so that it can store more than 8000 bytes at a time (which is more of an issue for this type of operation than it is for many arithmetic operations).

Another option that does not require subscriptions and should work out of the box (I haven't tried it myself) is this open-source project:

GROUP_CONCAT string aggregate for SQL Server

This project has not been updated since 2013-05-09, but I suspect that it would do what you want and would fit into your query just like SQL#.Agg_Join and any other aggregate. There is an install script, GroupConcatInstallation.sql, in the Installation Scripts folder that contains the Assembly and T-SQL wrapper objects.

And yes, all of these aggregates should work with PHP or anything else since they are part of the query and hence it is SQL Server that does the processing and it has nothing to do with any particular client software.

Upvotes: 0

Related Questions