Ageis
Ageis

Reputation: 2315

How to delete a duplicate record without using primary key

I went for an interview today where they give me technical test on sql. One of them was how to delete duplicate records without a primary key.

For one I can't imagine a table without a primary key. Yes I have read the existing threads on this. Say this happened and needed to be fixed Now. Couldn't I just add to the end of the table a automatically incrementing id then use that to delete the duplicate record?

Can anyone think of a reason why that won't work? I tried it on a simple database I created and I can't see any problems

Upvotes: 1

Views: 1062

Answers (2)

ydaetskcoR
ydaetskcoR

Reputation: 56877

You've got a couple of options here.

If they don't mind you dropping the table you could SELECT DISTINCT * from the table in question and then INSERT this into a new table, DROPping the old table as you go. This obviously won't be usable in a Production database but can be useful for where someone has mucked up a routine that's populating a data warehouse for example.

Alternatively you could effectively create a temporary index by using the row number as per this answer. That answer shows you how to use the built in row_number() function in SQL server but could be replicated in other RDBMS' (not sure which but MySQL certainly) by declaring a variable called @row_num or equivalent and then using it in your SELECT statement as:

SET @row_num=0;
SELECT @row_num:=@row_num+1 AS row_num, [REMAINING COLUMNS GO HERE]

Upvotes: 1

Uriil
Uriil

Reputation: 12628

One of possible options how to do this:

  1. select distinct rows from your table(you can achieve this using group by all columns)
  2. insert result into new table
  3. drop first table
  4. alter second table to name of first one

But this is not always possible in production

Upvotes: 1

Related Questions