synccm2012
synccm2012

Reputation: 497

Removing duplicates from a single column

I have a table like this:

table

id
=====
1
1
2
2
6
6
7
5
5
9

I need to remove duplicates other than creating a new table. One way of doing it is:

create table_new (select distinct id from table )

Is there any delete command which removes the duplicates?

Upvotes: 1

Views: 1512

Answers (7)

wterbeek
wterbeek

Reputation: 451

First hit on google:

http://support.microsoft.com/kb/139444/en-us?fr=1

The article describes the use of a temporary table to hold the unique records, clearing the original table and adding the unique records back to it, you can probably do it by writing some stored procedure but i suppose you only have to do this once? then i would go with the method as described. Also to prevent future problem use the unique or primary key constraint on that field so that it is not possible to have 2 records with the same value.

Upvotes: 0

valex
valex

Reputation: 24134

IMHO in common case (Ansi SQL) you can't do it because without unique key field you can't distinguish one row from another so you can't delete it.

But in MySQL you can run this UGLY query. It uses @S sting to accumulate ID's so DON'T use it on a big table:

set @S:=',';
delete from t 
where if(LOCATE(CONCAT(',',ID,','),@S)>0,'del',@S:=CONCAT(@S,ID,','))='del'

Upvotes: 1

GeeksCook
GeeksCook

Reputation: 81

I'm not sure if there is any direct implementation, but here is a function implementation which you might be interested at.

function RemoveDuplicates($TableName, $UniqueFieldName, $IDFieldName, $FirstFoundIDValue)
{
    $Query =    "DELETE  FROM ".$TableName." ".
                "WHERE    ".$IDFieldName." IN ".
                "( SELECT a.".$IDFieldName." ".
                    "FROM ".$TableName." a, ".$TableName." b ".
                    "WHERE (a.".$UniqueFieldName." = b.".$UniqueFieldName.") ".
                    "AND (a.".$IDFieldName." > ".$FirstFoundIDValue.") ".
                    "GROUP BY a.".$IDFieldName." ".
                    "HAVING COUNT(a.".$UniqueFieldName.") > 1 ".
                ")";
    db_query($Query);

// Example Use:            
// delete  from tblps_CommEquip 
// where PSID in    
// ( select a.PSID from tblps_CommEquip a, tblps_CommEquip b     
//      where    (a.CommEquipType = b.CommEquipType) 
//      AND (a.PSID > 80)    
//      group by a.PSID 
//      having count(a.CommEquipType) > 1
// )
} 

Upvotes: 0

lojuego
lojuego

Reputation: 11

The best way for deleting duplicate rows is to provide a primary key column that uniquely identifies each record in the table.

Upvotes: 0

Kamal
Kamal

Reputation: 5522

A single query to delete all duplicates (tested in Oracle)

delete from my_table where rowid not in (select min(rowid) from temp_table group by id);

Explanation:

Oracle gives every row a rowid by default (I am sure other databases have same or similar thing).

Let's look at the inner query first. Group by clause groups the rows based on some criteria, in our case Id. We can group based on multiple columns as well (group by col1, col2). When we say min(rowid), we are just making sure that it return one rowid in case there are multiple rows with duplicate columns.

The outer delete query is simple, it just deletes everything else (duplicates) from table which is not returned by our inner query

Upvotes: 0

Satinder singh
Satinder singh

Reputation: 10198

Sample code

http://sqlfiddle.com/#!3/56182/2

Upvotes: 0

Robert
Robert

Reputation: 8767

You can use cte:

;WITH cte
     AS (SELECT ROW_NUMBER() OVER (PARTITION BY ID 
                                       ORDER BY ( SELECT 0)) RN
         FROM   #MyTable)
DELETE FROM cte
WHERE  RN > 1

This will avoid the need to create a temporary or holding table for your distinct records. If you have additional columns that may contain a duplicate value, then you can add them after ID in the PARTITION BY expression.

Although I would be concerned as to why you have duplicate ID values.

  • If this is a row ID, then I would suggest using IDENTITY and auto-increment the IDs.
  • If it is an item or reference ID, then I would try to avoid adding duplicates (this could be completely wrong depending on your structure)

Upvotes: 0

Related Questions