Reputation: 21
I've been doing a lot of research but I haven't been able to find a way to do what I want. Consider the following table:
| PK | OrderNumber | ZIP | PhoneNumber
| 1111 | 11-11 | 11111 | 111-111-1111
| 2222 | 22-22 | 22222 | 222-222-2222
| 3333 | 33-33 | 33333 | 333-333-3333
| 4444 | 44-44 | 44444 | 444-444-4444
| 5555 | 11-11 | 11111 | 111-111-1111
| 6666 | 22-22 | 22222 | 222-222-2222
| 7777 | 33-33 | 33333 | 333-333-3333
| 8888 | 44-44 | 44444 | 444-444-4444
| 9999 | 99-99 | 99999 | 999-999-9999
Is there a way to find all duplicates and assign a unique identifier to each group? This is what I would like to see:
| DuplicateID | PK | OrderNumber | ZIP | PhoneNumber
| 1 | 1111 | 11-11 | 11111 | 111-111-1111
| 1 | 5555 | 11-11 | 11111 | 111-111-1111
| 2 | 2222 | 22-22 | 22222 | 222-222-2222
| 2 | 6666 | 22-22 | 22222 | 222-222-2222
| 3 | 3333 | 33-33 | 33333 | 333-333-3333
| 3 | 7777 | 33-33 | 33333 | 333-333-3333
| 4 | 4444 | 44-44 | 44444 | 444-444-4444
| 4 | 8888 | 44-44 | 44444 | 444-444-4444
| 5 | 9999 | 99-99 | 99999 | 999-999-9999
I know how to find all the duplicates and put them in a #Temp table but I don't know how to assign each group a unique identifier.
I apologize in advance for my tables, I wanted to post an image but apparently I need to have more than "10" reputation to be able to do that.
Thank you for any help you can provide
Upvotes: 2
Views: 430
Reputation: 57093
You already have a unique identifier i.e. the compound of OrderNumber
, ZIP
and PhoneNumber
.
Consider that the supposed "unique identifier" PK
has enabled de facto duplicates to appear, rather than prevented them. I don't think you need another such "unique identifier"!
Upvotes: 1
Reputation: 139010
If you use SQL Server 2005 or higher you can use dense_rank().
select dense_rank() over(order by OrderNumber, ZIP, PhoneNumber) as DuplicateID,
*
from YourTable
If you have a DuplicateID
column in your table you want to update you can do like this:
;with C as
(
select DuplicateID,
dense_rank() over(order by OrderNumber, ZIP, PhoneNumber) as rn
from YourTable
)
update C
set DuplicateID = rn
Upvotes: 3
Reputation: 17643
select
dense_rank() over (order by OrderNumber , ZIP, PhoneNumber) as DuplicateID,
pk,
OrderNumber ,
ZIP,
PhoneNumber,
row_number() over (partition by OrderNumber , ZIP, PhoneNumber) as nth_duplicate
from your_table
Upvotes: 2