Reputation: 359

Delete duplicates from a composite primary key table

I have a table with composite primary key(a,b). I took a backup in temp table but some how there came the duplicate records. now am not able to move the backup records to main table.

kindly suggest some way to delete the duplicates in the temp table based on 2 columns. Table is oracle table with 70 million records around 4 million duplicates

Upvotes: 4

Answers (4)

Richard Tyler Miles

Reputation: 683

I liked @a_horse_with_no_name's answer but got an Error

[HY000][1093] You can't specify target table 'example' for update in FROM clause

I've used this resource to extend the answer. I needed to delete duplicate row for what will is a precursor to containing the columns using MySQL via unique composite index (key).

DELETE FROM wp_example_table_name
WHERE (user_id1, user_id2) in (SELECT * FROM (SELECT user_id1, user_id2
                               FROM wp_example_table_name
                               GROUP BY user_id1, user_id2
                               HAVING COUNT(*) > 1) tmp)

Upvotes: 0

Rohan

Reputation: 2030

Say I have data like


1   user1   1
2   user1   1
3   user1   1
4   user1   3
5   user1   9
6   user2   1
7   user2   1
8   user2   10
9   user3   97

Here the user1(3) and user2(1) have repeated records.

This query will show ONLY those records that are in the repeated case Hence instead of 3 records for user1 it shows just 2


 SELECT BT.NAME, BT.LOCATION
   FROM SO_BUFFER_TABLE_7 BT,
        (SELECT BT.NAME, BT.LOCATION, MAX(ROWID) AS UNQ
           FROM SO_BUFFER_TABLE_7 BT
          GROUP BY BT.NAME, BT.LOCATION
         HAVING COUNT(*) > 1) A
  WHERE BT.ROWID <> A.UNQ
    AND BT.NAME = A.NAME
    AND BT.LOCATION = A.LOCATION

The result is


1   user1   1
2   user1   1
3   user2   1

You can replace the SELECT with DELETE to remove the repeated records.

Hope it helps

Upvotes: 1

Ozair Kafray

Reputation: 13539

There are 3 methods of doing it listed in orafaq
The simplest way to eliminate the duplicate records is to SELECT DISTINCT into a temporary table, truncate the original table and SELECT the records back into the original table. Details here.
This one is for MSSQL Server, but I guess the same principles apply in your case too.

Upvotes: 1

user330315

Reputation:

You have two options:

delete the duplicates in the source table before copying them
don't copy the duplicates in the first place.

The second one is more efficient and can be done using this statement:

INSERT INTO target_table (a,b,c,d,e,f)
SELECT a,b,max(c),max(d),max(e),max(f)
FROM source_table
GROUP BY a,b;

As an alternative (e.g. when some of the columns are defined as CLOB or LONG and you can't apply a max() on them):

INSERT INTO target_table (a,b,c,d,e,f)
SELECT a,b,c,d,e,f
FROM (
    SELECT a,b,c,d,e,f,
           row_number() over (partition by a,b) as rn
    FROM source_table
) 
WHERE rn = 1;

If you really want to first delete the rows, you can do this with the following statement:

DELETE FROM source_table
WHERE rowid NOT IN (SELECT min(rowid)
                    FROM source_table
                    GROUP BY a,b
                    HAVING count(*) > 1)

Upvotes: 3

Delete duplicates from a composite primary key table

Answers (4)

Related Questions