Koppula
Koppula

Reputation: 95

How do we aggregate string data types in MSSQL query

I have SQL table like below

enter image description here

need to query the data in either of below format. Please help me to build the MS SQL query to get results in desired format.

enter image description here

Below is the SQL scripts to create table/insert the records.

    create table Table1 (ID varchar(50),Col1 varchar(50),Col2 varchar(50),Col3 varchar(50),Col4 varchar(50),Col5 varchar(50),Col6 varchar(50))
go
insert into Table1
select '1-117D',null,null,'DDDD',null,null,'[email protected]' union all 
select '1-117D',null,'CCCC',null,null,'[email protected]',null union all 
select '1-117D','AAAA',null,null,'[email protected]',null,null union all 
select '1-117D','BBBB',null,null,'[email protected]',null,null 
go
select * from Table1

Upvotes: 0

Views: 106

Answers (3)

lptr
lptr

Reputation: 6788

select distinct 
    ID,
    Col1 = isnull(Col1, first_value(Col1) over (partition by ID order by case when Col1 is null then 1 else 0 end, Col1)),
    Col2 = isnull(Col2, first_value(Col2) over (partition by ID order by case when Col2 is null then 1 else 0 end, Col2)),
    Col3 = isnull(Col3, first_value(Col3) over (partition by ID order by case when Col3 is null then 1 else 0 end, Col3)),
    Col4 = isnull(Col4, first_value(Col4) over (partition by ID order by case when Col4 is null then 1 else 0 end, Col4)),
    Col5 = isnull(Col5, first_value(Col5) over (partition by ID order by case when Col5 is null then 1 else 0 end, Col5)),
    Col6 = isnull(Col6, first_value(Col6) over (partition by ID order by case when Col6 is null then 1 else 0 end, Col6))
from Table1;

Upvotes: 0

JoeFletch
JoeFletch

Reputation: 3960

I would also like to reiterate the fact that you may want to normalize the structure of your data from a horizontal format to a vertical format to allow for data scaling.

But outside of that feedback you can try this to get what you have displayed in result set 1. You would need to test this for the remaining dataset to ensure this is what you want the code to do.

IF OBJECT_ID(N'tempdb..#t') IS NOT NULL DROP TABLE #t
CREATE TABLE #t (ID varchar(50),Col1 varchar(50),Col2 varchar(50),Col3 varchar(50),Col4 varchar(50),Col5 varchar(50),Col6 varchar(50))

GO

INSERT INTO #t
SELECT '1-117D',NULL,NULL,'DDDD',NULL,NULL,'[email protected]' UNION ALL 
SELECT '1-117D',NULL,'CCCC',NULL,NULL,'[email protected]',NULL UNION ALL 
SELECT '1-117D','AAAA',NULL,NULL,'[email protected]',NULL,NULL UNION ALL 
SELECT '1-117D','BBBB',NULL,NULL,'[email protected]',NULL,NULL 

GO

;WITH n1 AS (
    SELECT
        ID,--Columns that are the same for each row
        Col1,Col2,Col3,Col4,Col5,Col6
    FROM #t AS t
),
n2 AS (
SELECT ID,FieldName,Value, ROW_NUMBER() OVER (PARTITION BY ID,FieldName ORDER BY Value) AS RowNumber
FROM n1
UNPIVOT ( Value FOR FieldName IN (Col1,Col2,Col3,Col4,Col5,Col6) ) AS pvt1
)
SELECT ID,Col1,Col2,Col3,Col4,Col5,Col6
FROM n2
PIVOT (
    MAX(Value) 
    FOR FieldName IN (
        Col1,Col2,Col3,Col4,Col5,Col6
        )
) AS pivot_table;

GO

DROP TABLE #t

Upvotes: 1

Thom A
Thom A

Reputation: 95571

This is really ugly, but it works based on the sample data we have.

WITH RNs AS(
    SELECT T1.ID,
           T1.Col1,
           T1.Col2,
           T1.Col3,
           T1.Col4,
           T1.Col5,
           T1.Col6,
           CASE WHEN T1.Col1 IS NOT NULL THEN ROW_NUMBER() OVER (PARTITION BY T1.ID, CASE WHEN T1.Col1 IS NOT NULL THEN 0 ELSE 1 END ORDER BY T1.Col1) END AS Col1RN,
           CASE WHEN T1.Col2 IS NOT NULL THEN ROW_NUMBER() OVER (PARTITION BY T1.ID, CASE WHEN T1.Col2 IS NOT NULL THEN 0 ELSE 1 END ORDER BY T1.Col2) END AS Col2RN,
           CASE WHEN T1.Col3 IS NOT NULL THEN ROW_NUMBER() OVER (PARTITION BY T1.ID, CASE WHEN T1.Col3 IS NOT NULL THEN 0 ELSE 1 END ORDER BY T1.Col3) END AS Col3RN,
           CASE WHEN T1.Col4 IS NOT NULL THEN ROW_NUMBER() OVER (PARTITION BY T1.ID, CASE WHEN T1.Col4 IS NOT NULL THEN 0 ELSE 1 END ORDER BY T1.Col4) END AS Col4RN,
           CASE WHEN T1.Col5 IS NOT NULL THEN ROW_NUMBER() OVER (PARTITION BY T1.ID, CASE WHEN T1.Col5 IS NOT NULL THEN 0 ELSE 1 END ORDER BY T1.Col5) END AS Col5RN,
           CASE WHEN T1.Col6 IS NOT NULL THEN ROW_NUMBER() OVER (PARTITION BY T1.ID, CASE WHEN T1.Col6 IS NOT NULL THEN 0 ELSE 1 END ORDER BY T1.Col6) END AS Col6RN
    FROM dbo.Table1 T1)
SELECT R.ID,
       MAX(Col1) AS Col1,
       MAX(Col2) AS Col2,
       MAX(Col3) AS Col3,
       MAX(Col4) AS Col4,
       MAX(Col5) AS Col1,
       MAX(Col6) AS Col6
FROM RNs R
GROUP BY R.ID,
         COALESCE(R.Col1RN, R.Col2RN, R.Col3RN),
         COALESCE(R.Col4RN, R.Col5RN, R.Col6RN);

Like I mention in the comment though, the data you supply suggest you have some significant denormalisation issues, and you should really be working to fix that. If, for example Columns 1-3 represent a name, and 4-6 an email, then there should just be 2 columns, a name column and an email column (and perhaps some kind of further ID). Certainly not 3, with NULL values in the columns that aren't completed, and when you "run out" of columns start reusing it again.

Unfortunately, there's not enough information for me to correctly guess what your data should really look like (the columns having no names, for example, means that I don't know if they do represent the same thing or not). I would, however, consider taking a step back and reconsidering your design here anyway.

Upvotes: 1

Related Questions