Reputation: 57
I need to transform multiple similar records into a single record. There can be up to 10 rows that need to be combined. Each group of rows that need to be combined have the same ID. And the values of the row data are irrelevant (will actually be a GUID). Data looks like this:
Table A
ID C1 C2 C3
ID1 x x x
ID1 y y y
ID2 y y y
ID2 x x x
ID2 y y y
ID2 y y y
ID3 x x x
ID3 y y y
ID3 y y y
I Need to transform to this structure and only have one record per ID. There can be N number of columns based upon the number of records that have the same ID (about 10).
Table B
ID C1 C2 C3 C1A C2A C3A C1B C2B C3B
ID1 x x x y y y null null null
ID2 y y y x x x y y y
ID3 x x x y y y y y y
I cannot modify table B at all. Just merge or insert to it.
I am using SQL Server 2008 R2 and the volume of table A is about a million records.
Any help is greatly appriciated.
UPDATE: Adding real table definitions.
Here is the TableA create script:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[IntervalPivotTable](
[UID] [uniqueidentifier] NOT NULL,
[ServiceHash] [int] NULL,
[IntervalID] [nvarchar](50) NULL,
[IntervalTypeID] [nvarchar](50) NULL,
[IntervalGroupID] [nvarchar](50) NULL,
[DrivingConditionID] [nvarchar](50) NULL,
CONSTRAINT [PK_IntervalPivotTable] PRIMARY KEY CLUSTERED
(
[UID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[IntervalPivotTable] ADD CONSTRAINT [DF_IntervalPivotTable_UID] DEFAULT (newid()) FOR [UID]
GO
Here is the TableB create script: SET ANSI_NULLS ON GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[PivotedIntervals](
[ServiceHash] [int] NULL,
[IntervalID_0] [nvarchar](50) NULL,
[IntervalTypeID_0] [nvarchar](50) NULL,
[IntervalGroupID_0] [nvarchar](50) NULL,
[DrivingConditionID_0] [nvarchar](50) NULL,
[IntervalID_1] [nvarchar](50) NULL,
[IntervalTypeID_1] [nvarchar](50) NULL,
[IntervalGroupID_1] [nvarchar](50) NULL,
[DrivingConditionID_1] [nvarchar](50) NULL,
[IntervalID_2] [nvarchar](50) NULL,
[IntervalTypeID_2] [nvarchar](50) NULL,
[IntervalGroupID_2] [nvarchar](50) NULL,
[DrivingConditionID_2] [nvarchar](50) NULL,
[IntervalID_3] [nvarchar](50) NULL,
[IntervalTypeID_3] [nvarchar](50) NULL,
[IntervalGroupID_3] [nvarchar](50) NULL,
[DrivingConditionID_3] [nvarchar](50) NULL,
[IntervalID_4] [nvarchar](50) NULL,
[IntervalTypeID_4] [nvarchar](50) NULL,
[IntervalGroupID_4] [nvarchar](50) NULL,
[DrivingConditionID_4] [nvarchar](50) NULL,
[IntervalID_5] [nvarchar](50) NULL,
[IntervalTypeID_5] [nvarchar](50) NULL,
[IntervalGroupID_5] [nvarchar](50) NULL,
[DrivingConditionID_5] [nvarchar](50) NULL,
[IntervalID_6] [nvarchar](50) NULL,
[IntervalTypeID_6] [nvarchar](50) NULL,
[IntervalGroupID_6] [nvarchar](50) NULL,
[DrivingConditionID_6] [nvarchar](50) NULL,
[IntervalID_7] [nvarchar](50) NULL,
[IntervalTypeID_7] [nvarchar](50) NULL,
[IntervalGroupID_7] [nvarchar](50) NULL,
[DrivingConditionID_7] [nvarchar](50) NULL,
[IntervalID_8] [nvarchar](50) NULL,
[IntervalTypeID_8] [nvarchar](50) NULL,
[IntervalGroupID_8] [nvarchar](50) NULL,
[DrivingConditionID_8] [nvarchar](50) NULL,
[IntervalID_9] [nvarchar](50) NULL,
[IntervalTypeID_9] [nvarchar](50) NULL,
[IntervalGroupID_9] [nvarchar](50) NULL,
[DrivingConditionID_9] [nvarchar](50) NULL,
[IntervalID_10] [nvarchar](50) NULL,
[IntervalTypeID_10] [nvarchar](50) NULL,
[IntervalGroupID_10] [nvarchar](50) NULL,
[DrivingConditionID_10] [nvarchar](50) NULL
) ON [PRIMARY]
GO
Upvotes: 0
Views: 605
Reputation: 19356
You might want to try with unpivot/pivot combination. Unpivot transforms TableA into rows with three columns only, id, ColumnID and guid, for example
ID1, C1_0, x
ID1, C2_0, x
ID1, C3_0, x
ID1, C1_1, x
ID1, C2_1, x
ID1, C3_1, x
adding unique number to each id/column combination (row_number over (...) part), and does that for each row in TableA. Pivot will transform them into single row by id:
select *
from
(
select id,
code
+ '_'
+ convert(varchar(10), ColumnID) - 1 ColumnID,
guid
from
(
select TableA.*,
row_number() over (partition by ID
order by TableA_PK) - 1 ColumnID
from TableA
)
unpivot
(
guid for code in (c1, c2, c3)
) as u
) UnpivotedTable
pivot
(
min(guid)
for columnid in ([c1_0], [c2_0], [c3_0], [c1_1], [c2_1], [c3_1])
) PivotedTable
Don't be alarmed by presence of min(guid). It is here because pivot insist on aggregate function. Since there is only one guid per ID/ColumnID combination, there will be no missing values. To verify, replace min(guid) with count (guid) and check for values greater than 1.
UPDATE: to avoid mixing different rows from same ID I had to change order by clause of row_number() to order by primary key of TableA. If table has no PK, one should transform input so that it includes row_number() as a surrogate primary key:
(
select TableA.*, row_number() over (order by ID) TableA_PK
from TableA
) TableAWithAPrimaryKey
and use derived table instead of original table.
UPDATE 2:
My mistake was placing row_number() in unpivot select. Row numbers for ID have to be fetched before transformation, obviously. This is a pivot that uses original tables:
select *
from
(
select ServiceHash,
code
+ '_'
+ convert(varchar(10), ColumnID) ColumnID,
guid
from
(
select IntervalPivotTable.*,
row_number() over (partition by ServiceHash
order by UID) - 1 ColumnID
from IntervalPivotTable
) a
unpivot
(
guid for code in
(
IntervalID,
IntervalTypeID,
IntervalGroupID,
DrivingConditionID
)
) as u
) UnpivotedTable
pivot
(
min(guid)
for columnid in
(
IntervalID_0,
IntervalTypeID_0,
IntervalGroupID_0,
DrivingConditionID_0,
IntervalID_1,
IntervalTypeID_1,
IntervalGroupID_1,
DrivingConditionID_1,
IntervalID_2,
IntervalTypeID_2,
IntervalGroupID_2,
DrivingConditionID_2
-- Continue list of pivoted columns up to _10 here
)
) PivotedTable
Upvotes: 3