aw1975
aw1975

Reputation: 1708

T-SQL to fix duplicate data in table

The table below represents a clean set of data. Notice that for each item, the Priority value is a unique integer between 1 and the number of occurrences of that item:

ItemKey    ItemName  Priority  
1           Item1      1      
2           Item1      2      
3           Item1      3      
4           Item2      1      
5           Item3      1 
6           Item3      2
7           Item4      1
8           Item4      2

Unfortunately there is a bug in our application that has resulted in duplicate priorities for some items. E.g. in the following table, Item1 and Item4 both have duplicate priorities.

ItemKey    ItemName  Priority  
1           Item1      1      
2           Item1      1      
3           Item1      2      
4           Item2      1      
5           Item3      1 
6           Item3      2
7           Item4      1
8           Item4      1

I would appreciate some guidance on how to write a T-SQL script to cleanse the above data. The script should determine those items that have duplicate priorities, and updates the priorities for the offending items so that they are unique for that item, and have value between 1 and number of occurrences of the item. After cleansing the data, the number of rows should remain unchanged.

I only have experience with PL/SQL so am not certain what is the best approach to implement this using T-SQL.

Thank you for your help.

Upvotes: 1

Views: 70

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269503

In your particular example, the priority is directly related to ItemKey. If this is not the case, a more general approach is:

with toupdate as (
      select t.*,
             row_number() over (partition by ItemName
                                order by Priority, ItemKey
                               ) as new_priority
      from table t
     )
update toupdate
    set priority = newpriority
    where priority <> newpriority;

Upvotes: 2

Giorgi Nakeuri
Giorgi Nakeuri

Reputation: 35780

With window function:

;with cte as(
select *, row_number() over(partition by ItemName order by ItemKey) as rn from Table)
update cte set Priority = rn

Upvotes: 3

Related Questions