Shahbaz Hussain
Shahbaz Hussain

Reputation: 9

Deleting duplicated column where id is the same

I am creating a stored procedure for a client. However, I have table which has duplicate rows with only one different as shown below:

Id          LastName      Total_Pay    Holiday_Pay  Total_Earning
617A06      KAZMIEK          0            139.45    139.45
617A06      KAZMIEK        284.3          139.45    423.75
617A0       NIZIO          172             0        172
00178       SZWAJKOWSKA     0            181.03     181.03

Now i have used the following code to see which row is duplicating:

WITH CTE AS 
(SELECT *,RN=ROW_NUMBER() OVER(PARTITION BY tax_period, id ORDER BY tax_period) 
 FROM #worker_totals)
select  *
FROM CTE

and it gives me the right results:

id           name      total_pay     holiday_pay     total_earning   rn
617A06      KAZMIEK       0           139.45            139.45        1
617A06      KAZMIEK      284.3        139.45            423.75        2
617A0       NIZIO        172            0               172           1
00178       SZWAJKOWSKA   0          181.03            181.03         1

Now the first row is wrong because the total pay should be 284.3 and should be repeated for the same person. The same has happened for over 60 records. How do i delete the rn 1 for the same person who has rn2 because rn1 is wrong wen the same person has rn2.

I have tried the following:

Delete From cte
Where rn > 1;

but i's not what i want. I want to delete the rn =1 when there is rn =2 for the same id.

Upvotes: 0

Views: 60

Answers (2)

Mark Sinkinson
Mark Sinkinson

Reputation: 976

You're currently ordering by tax_period ASC (the ascending part is implicit as no direction has been specified). Therefore the 'later' tax period, which is likely to be more correct than the earlier, is always row 2.

If you instead specify a DESC in your ORDER BY, then the latest available tax period is always going to be Row 1. This means you should be able to delete anything rn > 1.

For records with only 1 period, you'll delete nothing. For records with multiple periods, you'll delete all except the latest available.

WITH CTE AS 
(
  SELECT 
    *,
    RN=ROW_NUMBER() OVER(PARTITION BY tax_period, id ORDER BY tax_period DESC) 
  FROM #worker_totals
)
DELETE *
FROM CTE 
WHERE RN > 1;

Upvotes: 2

user4219031
user4219031

Reputation: 307

  WITH CTE AS 
  (SELECT *,RN=ROW_NUMBER() OVER(PARTITION BY tax_period, id ORDER BY tax_period) 
  FROM #worker_totals),
  CTEMoreThanOne AS (SELECT * FROM CTE WHERE rn>1)--or=2
  DELETE c1 FROM
  CTE c1 INNER JOIN CTEMoreThanOne c2 ON c1.Id = c2.Id
  WHERE c1.rn=1

Upvotes: 0

Related Questions