Reputation: 359
I have a table with composite primary key(a,b). I took a backup in temp table but some how there came the duplicate records. now am not able to move the backup records to main table.
kindly suggest some way to delete the duplicates in the temp table based on 2 columns. Table is oracle table with 70 million records around 4 million duplicates
Upvotes: 4
Views: 7436
Reputation: 683
I liked @a_horse_with_no_name's answer but got an Error
[HY000][1093] You can't specify target table 'example' for update in FROM clause
I've used this resource to extend the answer. I needed to delete duplicate row for what will is a precursor to containing the columns using MySQL via unique composite index (key).
DELETE FROM wp_example_table_name
WHERE (user_id1, user_id2) in (SELECT * FROM (SELECT user_id1, user_id2
FROM wp_example_table_name
GROUP BY user_id1, user_id2
HAVING COUNT(*) > 1) tmp)
Upvotes: 0
Reputation: 2030
Say I have data like
1 user1 1
2 user1 1
3 user1 1
4 user1 3
5 user1 9
6 user2 1
7 user2 1
8 user2 10
9 user3 97
Here the user1(3) and user2(1) have repeated records.
This query will show ONLY those records that are in the repeated case Hence instead of 3 records for user1 it shows just 2
SELECT BT.NAME, BT.LOCATION
FROM SO_BUFFER_TABLE_7 BT,
(SELECT BT.NAME, BT.LOCATION, MAX(ROWID) AS UNQ
FROM SO_BUFFER_TABLE_7 BT
GROUP BY BT.NAME, BT.LOCATION
HAVING COUNT(*) > 1) A
WHERE BT.ROWID <> A.UNQ
AND BT.NAME = A.NAME
AND BT.LOCATION = A.LOCATION
The result is
1 user1 1
2 user1 1
3 user2 1
You can replace the SELECT with DELETE to remove the repeated records.
Hope it helps
Upvotes: 1
Reputation: 13539
There are 3 methods of doing it listed in orafaq
The simplest way to eliminate the duplicate records is to SELECT DISTINCT into a temporary table, truncate the original table and SELECT the records back into the original table. Details here.
This one is for MSSQL Server, but I guess the same principles apply in your case too.
Upvotes: 1
Reputation:
You have two options:
The second one is more efficient and can be done using this statement:
INSERT INTO target_table (a,b,c,d,e,f)
SELECT a,b,max(c),max(d),max(e),max(f)
FROM source_table
GROUP BY a,b;
As an alternative (e.g. when some of the columns are defined as CLOB
or LONG
and you can't apply a max() on them):
INSERT INTO target_table (a,b,c,d,e,f)
SELECT a,b,c,d,e,f
FROM (
SELECT a,b,c,d,e,f,
row_number() over (partition by a,b) as rn
FROM source_table
)
WHERE rn = 1;
If you really want to first delete the rows, you can do this with the following statement:
DELETE FROM source_table
WHERE rowid NOT IN (SELECT min(rowid)
FROM source_table
GROUP BY a,b
HAVING count(*) > 1)
Upvotes: 3