Reputation: 79

Join Duplicate Rows Into One Row in SQL Server

id name number counter
1 test  010101 2
2 test  010101 1

I need to select the duplicates rows that have the same number-name combination. And I need to update the counter in one of the rows with the sum of the counter in the two rows and then delete the second row

I'm using the below query to select the duplication but I'm not being able to do the update:

SELECT *
FROM [prof]
WHERE ID NOT IN
  (
  SELECT MAX(ID)
  FROM [prof]
  GROUP BY number, name
  )

Upvotes: 0

Answers (4)

SQLChao

Reputation: 7837

Here's a way to do it using a cte. Here's what the cte looks like

id  name    number  counter totalCounter    DupNum
2   test1   10101         1            3         1
1   test1   10101         2            3         2
5   test2   10102         5           12         1
4   test2   10102         4           12         2
3   test2   10102         3           12         3

You can run the whole thing with the delete statement commented out to update the data. Then comment out the update statement and uncomment out the delete and run again.

;WITH table1_cte
AS
(
  SELECT id
    name, 
    number, 
    counter,
    ROW_NUMBER() over(PARTITION BY name, number ORDER BY id DESC) AS DupNum,
    SUM(counter) over (PARTITION BY name, number) AS totalCounter
  FROM prof
)

UPDATE table1_cte
SET counter = totalCounter
WHERE dupnum =1

--DELETE FROM table1_cte
--WHERE dupnum > 1

Upvotes: 1

Oliver

Reputation: 167

This will update the counter and keep the first(lowest) id. It will only keep one unique id so if there are 3 or more rows with the same name, number this will still work.

Update [prof]
set counter=a.cnt
from [prof] p inner join ( 
    select name,number,sum(counter)cnt
    from [prof] 
    group by name,number)a
on p.name=a.name and p.number=a.number;

delete [prof] 
from [prof] join (
     select id,row_number () over (partition by name, number order by id asc)row 
     from [prof])d 
on [prof].id=d.id
where d.row>1;

Upvotes: 1

KingOfAllTrades

Reputation: 421

This will take two statements, best wrapped in a transaction:

update prof
set counter = (select SUM(counter) from prof group by number, name)
where ID in (select MAX(id) from prof group by number, name);

delete from prof where ID not in 
(select MAX(id) from prof group by number, name);

Upvotes: 1

Dan Bracuk

Reputation: 20794

The update would be something like this.

update prof
set counter = counterSum
from prof join
(select name, number, sum(counter) counterSum
from prof
where whatever
group by name, number) temp on prof.name = temp.name and prf.number = temp.number
where whatever

The two "where whatever"s should be the same.

Upvotes: 0

Join Duplicate Rows Into One Row in SQL Server

Answers (4)

Related Questions