SirFaps

Reputation: 11

How do I merge and delete duplicated rows in SQL using UPDATE?

For example, I have this table:

id | code | name | type | deviceType
---+------+------+------+-----------
1  | 23   | xyz  | 0    | web
2  | 23   | xyz  | 0    | mobile
3  | 24   | xyzc | 0    | web
4  | 25   | xyzc | 0    | web

I want the result to be:

id | code | name | type | deviceType
---+------+------+------+-----------
1  | 23   | xyz  | 0    | web&mobile
2  | 24   | xyzc | 0    | web
3  | 25   | xyzc | 0    | web

How do I do this in SQL Server using UPDATE and DELETE statements?

Any help is greatly appreciated!

Upvotes: 0

Views: 78

Answers (2)

Tim Biegeleisen

Reputation: 522396

I might actually suggest just leaving the original data intact, and instead creating a view here:

CREATE VIEW yourView AS
SELECT ROW_NUMBER() OVER (ORDER BY MIN(id)) AS id,
       code, name, type,
       STRING_AGG(deviceType, '&') WITHIN GROUP (ORDER BY id) AS deviceType
FROM yourTable
GROUP BY code, name, type;

The main reason not to run the update is that you would have to repeat it every time new data comes in. Keeping the original data and querying the view whenever you need the merged result may work out better here.
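As a rough illustration of that point, here is what happens when a new row arrives. This is only a sketch: the inserted values are made up for the example, and it assumes id is a plain integer column rather than an IDENTITY column.

```sql
-- Hypothetical new row: a second device type for code 24.
-- Assumes id is not an IDENTITY column; if it is, leave id out of the insert.
INSERT INTO yourTable (id, code, name, type, deviceType)
VALUES (5, 24, 'xyzc', 0, 'mobile');

-- No UPDATE or DELETE is needed: the view is evaluated at query time,
-- so the (24, 'xyzc', 0) group now aggregates both of its rows.
SELECT id, code, name, type, deviceType
FROM yourView
ORDER BY id;
```

With the sample data from the question, the row for code 24 should now show web&mobile, while the underlying table still holds one row per device type.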

Note that this assumes SQL Server 2017 or later, where STRING_AGG is available. On earlier versions, STRING_AGG would have to be replaced with an uglier approach, though you should consider upgrading in that case.
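For reference, one commonly used substitute on older versions is STUFF combined with FOR XML PATH. The query below is a sketch of the same SELECT written that way, not something I have run against your data. The nvarchar(max) target type is an assumption, and the correlated comparison uses plain equality, so it will not match rows where code, name or type is NULL.

```sql
-- Pre-2017 sketch: concatenate deviceType per group with FOR XML PATH.
SELECT ROW_NUMBER() OVER (ORDER BY MIN(t.id)) AS id,
       t.code, t.name, t.type,
       STUFF((
           SELECT '&' + t2.deviceType
           FROM yourTable t2
           WHERE t2.code = t.code
             AND t2.name = t.name
             AND t2.type = t.type
           ORDER BY t2.id
           -- TYPE plus .value() avoids XML-escaping characters such as '&'
           FOR XML PATH(''), TYPE
       ).value('.', 'nvarchar(max)'), 1, 1, '') AS deviceType  -- STUFF drops the leading '&'
FROM yourTable t
GROUP BY t.code, t.name, t.type;
```

The TYPE directive matters here because the separator itself is an ampersand, which would otherwise come back as an XML entity.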

Upvotes: 1

GMB

Reputation: 222622

To do what you want, you would need two separate statements.

This updates the "first" row of each group with all the device types in the group:

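-- only groups with more than one row need their device types merged
update t
set t.devicetype = t1.devicetype
from mytable t
inner join (
    -- one row per duplicated group: its lowest id and the concatenated device types
    select min(id) as id, string_agg(devicetype, '&') within group(order by id) as devicetype
    from mytable
    group by code, name, type
    having count(*) > 1
) t1 on t1.id = t.id;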

This deletes everything but the first row per group:

-- rn = 1 marks the row with the lowest id in each group; the rest are duplicates
with t as (
    select row_number() over(partition by code, name, type order by id) as rn
    from mytable
)
delete from t where rn > 1;
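Since the delete only makes sense after the update has succeeded, you may want to run both in one transaction. The wrapper below is a sketch of that idea and an addition on my part; SET XACT_ABORT ON is there so that a runtime error (for example, the merged string not fitting in the devicetype column) rolls everything back.

```sql
SET XACT_ABORT ON;   -- any runtime error rolls back the whole transaction
BEGIN TRANSACTION;

-- Step 1: write the merged device types onto the first row of each group.
UPDATE t
SET t.devicetype = t1.devicetype
FROM mytable t
INNER JOIN (
    SELECT MIN(id) AS id,
           STRING_AGG(devicetype, '&') WITHIN GROUP (ORDER BY id) AS devicetype
    FROM mytable
    GROUP BY code, name, type
    HAVING COUNT(*) > 1
) t1 ON t1.id = t.id;

-- Step 2: remove every row except the first one per group.
WITH t AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY code, name, type ORDER BY id) AS rn
    FROM mytable
)
DELETE FROM t WHERE rn > 1;

COMMIT TRANSACTION;
```

Note that neither statement renumbers id. With the sample data, the surviving rows keep ids 1, 3 and 4 rather than the 1, 2, 3 shown in your expected output.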

Upvotes: 0
