Reputation: 2315
I went for an interview today where they give me technical test on sql. One of them was how to delete duplicate records without a primary key.
For one I can't imagine a table without a primary key. Yes I have read the existing threads on this. Say this happened and needed to be fixed Now. Couldn't I just add to the end of the table a automatically incrementing id then use that to delete the duplicate record?
Can anyone think of a reason why that won't work? I tried it on a simple database I created and I can't see any problems
Upvotes: 1
Views: 1062
Reputation: 56877
You've got a couple of options here.
If they don't mind you dropping the table you could SELECT DISTINCT *
from the table in question and then INSERT
this into a new table, DROP
ping the old table as you go. This obviously won't be usable in a Production database but can be useful for where someone has mucked up a routine that's populating a data warehouse for example.
Alternatively you could effectively create a temporary index by using the row number as per this answer. That answer shows you how to use the built in row_number()
function in SQL server but could be replicated in other RDBMS' (not sure which but MySQL certainly) by declaring a variable called @row_num
or equivalent and then using it in your SELECT
statement as:
SET @row_num=0;
SELECT @row_num:=@row_num+1 AS row_num, [REMAINING COLUMNS GO HERE]
Upvotes: 1
Reputation: 12628
One of possible options how to do this:
But this is not always possible in production
Upvotes: 1