sreeli
sreeli

Reputation: 57

Transform data from many record to single record in sql server

I need to transform multiple similar records into a single record. There can be up to 10 rows that need to be combined. Each group of rows that need to be combined have the same ID. And the values of the row data are irrelevant (will actually be a GUID). Data looks like this:

Table A

ID    C1   C2    C3  
ID1   x    x     x  
ID1   y    y     y  
ID2   y    y     y  
ID2   x    x     x  
ID2   y    y     y  
ID2   y    y     y  
ID3   x    x     x  
ID3   y    y     y  
ID3   y    y     y  

I Need to transform to this structure and only have one record per ID. There can be N number of columns based upon the number of records that have the same ID (about 10).

Table B

ID     C1     C2     C3     C1A     C2A     C3A     C1B     C2B     C3B
ID1    x      x      x      y       y       y       null    null    null
ID2    y      y      y      x       x       x       y       y       y
ID3    x      x      x      y       y       y       y       y       y

I cannot modify table B at all. Just merge or insert to it.
I am using SQL Server 2008 R2 and the volume of table A is about a million records.
Any help is greatly appriciated.

UPDATE: Adding real table definitions.

Here is the TableA create script:

SET ANSI_NULLS ON
    GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[IntervalPivotTable](
    [UID] [uniqueidentifier] NOT NULL,
    [ServiceHash] [int] NULL,
    [IntervalID] [nvarchar](50) NULL,
    [IntervalTypeID] [nvarchar](50) NULL,
    [IntervalGroupID] [nvarchar](50) NULL,
    [DrivingConditionID] [nvarchar](50) NULL,
 CONSTRAINT [PK_IntervalPivotTable] PRIMARY KEY CLUSTERED 
(
    [UID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  =     ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

ALTER TABLE [dbo].[IntervalPivotTable] ADD  CONSTRAINT [DF_IntervalPivotTable_UID]  DEFAULT         (newid()) FOR [UID]
GO

Here is the TableB create script: SET ANSI_NULLS ON GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[PivotedIntervals](
    [ServiceHash] [int] NULL,
    [IntervalID_0] [nvarchar](50) NULL,
    [IntervalTypeID_0] [nvarchar](50) NULL,
    [IntervalGroupID_0] [nvarchar](50) NULL,
    [DrivingConditionID_0] [nvarchar](50) NULL,
    [IntervalID_1] [nvarchar](50) NULL,
    [IntervalTypeID_1] [nvarchar](50) NULL,
    [IntervalGroupID_1] [nvarchar](50) NULL,
    [DrivingConditionID_1] [nvarchar](50) NULL,
    [IntervalID_2] [nvarchar](50) NULL,
    [IntervalTypeID_2] [nvarchar](50) NULL,
    [IntervalGroupID_2] [nvarchar](50) NULL,
    [DrivingConditionID_2] [nvarchar](50) NULL,
    [IntervalID_3] [nvarchar](50) NULL,
    [IntervalTypeID_3] [nvarchar](50) NULL,
    [IntervalGroupID_3] [nvarchar](50) NULL,
    [DrivingConditionID_3] [nvarchar](50) NULL,
    [IntervalID_4] [nvarchar](50) NULL,
    [IntervalTypeID_4] [nvarchar](50) NULL,
    [IntervalGroupID_4] [nvarchar](50) NULL,
    [DrivingConditionID_4] [nvarchar](50) NULL,
    [IntervalID_5] [nvarchar](50) NULL,
    [IntervalTypeID_5] [nvarchar](50) NULL,
    [IntervalGroupID_5] [nvarchar](50) NULL,
    [DrivingConditionID_5] [nvarchar](50) NULL,
    [IntervalID_6] [nvarchar](50) NULL,
    [IntervalTypeID_6] [nvarchar](50) NULL,
    [IntervalGroupID_6] [nvarchar](50) NULL,
    [DrivingConditionID_6] [nvarchar](50) NULL,
    [IntervalID_7] [nvarchar](50) NULL,
    [IntervalTypeID_7] [nvarchar](50) NULL,
    [IntervalGroupID_7] [nvarchar](50) NULL,
    [DrivingConditionID_7] [nvarchar](50) NULL,
    [IntervalID_8] [nvarchar](50) NULL,
    [IntervalTypeID_8] [nvarchar](50) NULL,
    [IntervalGroupID_8] [nvarchar](50) NULL,
    [DrivingConditionID_8] [nvarchar](50) NULL,
    [IntervalID_9] [nvarchar](50) NULL,
    [IntervalTypeID_9] [nvarchar](50) NULL,
    [IntervalGroupID_9] [nvarchar](50) NULL,
    [DrivingConditionID_9] [nvarchar](50) NULL,
    [IntervalID_10] [nvarchar](50) NULL,
    [IntervalTypeID_10] [nvarchar](50) NULL,
    [IntervalGroupID_10] [nvarchar](50) NULL,
    [DrivingConditionID_10] [nvarchar](50) NULL 
) ON [PRIMARY]

GO

Upvotes: 0

Views: 605

Answers (1)

Nikola Markovinović
Nikola Markovinović

Reputation: 19356

You might want to try with unpivot/pivot combination. Unpivot transforms TableA into rows with three columns only, id, ColumnID and guid, for example

ID1, C1_0, x
ID1, C2_0, x
ID1, C3_0, x
ID1, C1_1, x
ID1, C2_1, x
ID1, C3_1, x

adding unique number to each id/column combination (row_number over (...) part), and does that for each row in TableA. Pivot will transform them into single row by id:

select *
from
(
  select id, 
         code 
         + '_' 
         + convert(varchar(10), ColumnID) - 1 ColumnID,
         guid
  from 
  (
    select TableA.*,
           row_number() over (partition by ID
                              order by TableA_PK) - 1 ColumnID
      from TableA
  )
  unpivot
  (
    guid for code in (c1, c2, c3)
  ) as u
) UnpivotedTable
pivot
(
  min(guid)
  for columnid in ([c1_0], [c2_0], [c3_0], [c1_1], [c2_1], [c3_1])
) PivotedTable

Don't be alarmed by presence of min(guid). It is here because pivot insist on aggregate function. Since there is only one guid per ID/ColumnID combination, there will be no missing values. To verify, replace min(guid) with count (guid) and check for values greater than 1.

UPDATE: to avoid mixing different rows from same ID I had to change order by clause of row_number() to order by primary key of TableA. If table has no PK, one should transform input so that it includes row_number() as a surrogate primary key:

(
       select TableA.*, row_number() over (order by ID) TableA_PK
         from TableA
) TableAWithAPrimaryKey

and use derived table instead of original table.

UPDATE 2:

My mistake was placing row_number() in unpivot select. Row numbers for ID have to be fetched before transformation, obviously. This is a pivot that uses original tables:

select *
from
(
  select ServiceHash, 
         code 
         + '_' 
         + convert(varchar(10), ColumnID) ColumnID,
         guid
  from 
  (
    select IntervalPivotTable.*,
           row_number() over (partition by ServiceHash 
                            order by UID) - 1 ColumnID
      from IntervalPivotTable
  ) a
  unpivot
  (
    guid for code in 
    (
      IntervalID,
      IntervalTypeID,
      IntervalGroupID,
      DrivingConditionID
    )
  ) as u
) UnpivotedTable
pivot
(
  min(guid)
  for columnid in 
  (
    IntervalID_0,
    IntervalTypeID_0,
    IntervalGroupID_0,
    DrivingConditionID_0,
    IntervalID_1,
    IntervalTypeID_1,
    IntervalGroupID_1,
    DrivingConditionID_1,
    IntervalID_2,
    IntervalTypeID_2,
    IntervalGroupID_2,
    DrivingConditionID_2
 -- Continue list of pivoted columns up to _10 here
  )
) PivotedTable

Upvotes: 3

Related Questions