Remove duplicate data from mysql table accross two columns

Question

I have a MySQL Table which looks like:
(unique_id, uid_data1, uid_data2, sorting_data1, sorting_data2)

This table is used in a tool, where bidirectional relations weren't supported until now, so the table contains data that looks like (field order according line above):
(1, 1212, 2034, 1, 1)
(2, 2034, 1212, 1, 1)
(3, 4567, 9876, 1, 0)
(4, 9876, 4567, 0, 1)

The table also contains "single-directed" relations, i.e.
(5, 5566, 8899, 1, 9)
=> no row exists for (?, 8899, 5566, 9, 1)

As the tool now supports bidirectional/symmetric relations, I would like to remove the duplicate data from the mysql table - however I'm having some trouble finding an appropriate query to do this.
In the example above I would like to delete the rows with the uids 2 and 4 (as their data is already stored in the rows 1 and 3.

First, I tried to setup a SELECT-Statement to see, which entries would be deleted.
I thought of a JOIN-Query

SELECT x.uid, x.uid_link1, x.uid_link2, y.uid_link1 as 'uid_link2', y.uid_link2 as 'uid_link1'
FROM tx_sdfilmbase_hilfstab x
INNER JOIN tx_sdfilmbase_hilfstab y ON x.uid_link1=y.uid_link2 AND x.uid_link2=y.uid_link1
WHERE ???
ORDER BY x.uid_link1, x.uid_link2

However I'm stuck at the point where I have to tell MySQL to only select "half portion" of the records.
Any suggestions on how to do this?

P.S. Deleting each single record manually in the table isn't an option, as the table contains several thousand rows ;-)

Tony Hopkinson · Accepted Answer

Select t.* from MyTable t
inner join MyTable tt
On t.uid_data1 = tt.uid_data2 and t.uid_data2 = tt.uid_data1 and t.unique_ID > tt.unique_ID

Should find the "second" part of the pair (records 2 and 4 in your example)

If I got it right then

Delete t from MyTable t
inner join MyTable tt
On t.uid_data1 = tt.uid_data2 and t.uid_data2 = tt.uid_data1 and t.unique_ID > tt.unique_ID

should do the job

Remove duplicate data from mysql table accross two columns

Answers (2)

Related Questions