batigol999
batigol999

Reputation: 1

Merging distinct rows to one to many structure

I need to migrate data to new database which have different structure.

Source data is similar to this:

ID          Name       Surname    IDNumber Passport Mobile Email
----------- ---------- ---------- -------- -------- ------ ---------------
1           Nikola     Jokic      I123     P123     NULL   NULL
2           Nikola     Jokic      I123     NULL     M123   [email protected]
3           Nikola     Jokic      NULL     P123     NULL   [email protected]
4           Nikola     Jokic      NULL     NULL     NULL   NULL
5           Milos      Teodosic                            [email protected]
6           Milos      Teodosic            P333     M111   
7           Milos      Teodosic            P222     M111   [email protected]
8           Nikola     Jokic                               



Problem is the fact that there is no single column i can use as connection between rows.

For example: Rows with ids 1 and 2 are connected by ID number, 1 and 3 by Passport, so rows with id 1,2 and 3 are single customer. Mobile or email can be connection too. Row with id 4 has nothing but name and surname, so it can't be connected with any other, neither with 8.

I already tried using cursor, but the problem is number of rows in source data which is more than 100k and it is extremely slow.

Also, i was thinking about MERGE statement, but the problem is that the source is one table, but destination is contained of two tables which is not supported.

CREATE TABLE [dbo].[Source](
    [ID] [int] NOT NULL IDENTITY,
    [Name] [nvarchar](50) NOT NULL,
    [Surname] [nvarchar](50) NOT NULL,
    [IDNumber] [nvarchar](50) NULL,
    [Passport] [nvarchar](50) NULL,
    [Mobile] [nvarchar](50) NULL,
    [Email] [nvarchar](50) NULL
) 
GO

CREATE TABLE [dbo].[Customer](
    [ID] [INT] IDENTITY(1,1) NOT NULL,
    [Name] [NVARCHAR](50) NULL,
    [Surname] [NVARCHAR](50) NULL,
    [IDNumber] [NVARCHAR](50) NULL
)
GO

CREATE TABLE [dbo].[CustomerInfo](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [CustomerId] [int] NOT NULL,
    [InfoType] [nvarchar](50) NOT NULL,
    [Value] [nvarchar](50) NOT NULL
)
GO



--this is desired result



ID          Name       Surname    IDNumber CustomerInfoID InfoType   Value
----------- ---------- ---------- -------- -------------- ---------- ---------------
1           Nikola     Jokic      I123     1              Passport   P123           
1           Nikola     Jokic      I123     2              Mobile     M123           
1           Nikola     Jokic      I123     3              Email      [email protected] 
1           Nikola     Jokic      I123     4              Email      [email protected]
4           Nikola     Jokic      NULL     NULL           NULL       NULL
5           Milos      Teodosic   NULL     5              Passport   P333           
5           Milos      Teodosic   NULL     6              Passport   P222           
5           Milos      Teodosic   NULL     6              Mobile     M111           
5           Milos      Teodosic   NULL     7              Email      [email protected]   
8           Nikola     Jokic      NULL     NULL                                     

Upvotes: 0

Views: 44

Answers (1)

Max
Max

Reputation: 7100

Here is your first step:

with sourcedata 
as
(
    select Id, Name, Surname, IdNumber, u.InfoType, u.Value
    from Stack.Source S
    Unpivot (
        Value for InfoType in ([Passport],[Mobile],[Email])
    ) u
)

select distinct 
        DENSE_RANK() over (order by Name, Surname) as TheNewId
       ,Name
       ,Surname
       ,IdNumber
       ,InfoType
       ,Value 
from sourcedata

I have created a simple CTE for display purpouse only but you can change it to a (Indexed) view for performance.

From here you can run INSERT INTO .. SELECT for the two tables.

Upvotes: 1

Related Questions