Reputation: 337
I have a MySQL Table which looks like:
(unique_id, uid_data1, uid_data2, sorting_data1, sorting_data2)
This table is used in a tool, where bidirectional relations weren't supported until now, so the table contains data that looks like (field order according line above):
(1, 1212, 2034, 1, 1)
(2, 2034, 1212, 1, 1)
(3, 4567, 9876, 1, 0)
(4, 9876, 4567, 0, 1)
The table also contains "single-directed" relations, i.e.
(5, 5566, 8899, 1, 9)
=> no row exists for (?, 8899, 5566, 9, 1)
As the tool now supports bidirectional/symmetric relations, I would like to remove the duplicate data from the mysql table - however I'm having some trouble finding an appropriate query to do this.
In the example above I would like to delete the rows with the uids 2 and 4 (as their data is already stored in the rows 1 and 3.
First, I tried to setup a SELECT-Statement to see, which entries would be deleted.
I thought of a JOIN-Query
SELECT x.uid, x.uid_link1, x.uid_link2, y.uid_link1 as 'uid_link2', y.uid_link2 as 'uid_link1'
FROM tx_sdfilmbase_hilfstab x
INNER JOIN tx_sdfilmbase_hilfstab y ON x.uid_link1=y.uid_link2 AND x.uid_link2=y.uid_link1
WHERE ???
ORDER BY x.uid_link1, x.uid_link2
However I'm stuck at the point where I have to tell MySQL to only select "half portion" of the records.
Any suggestions on how to do this?
P.S. Deleting each single record manually in the table isn't an option, as the table contains several thousand rows ;-)
Upvotes: 2
Views: 3332
Reputation: 20330
Select t.* from MyTable t
inner join MyTable tt
On t.uid_data1 = tt.uid_data2 and t.uid_data2 = tt.uid_data1 and t.unique_ID > tt.unique_ID
Should find the "second" part of the pair (records 2 and 4 in your example)
If I got it right then
Delete t from MyTable t
inner join MyTable tt
On t.uid_data1 = tt.uid_data2 and t.uid_data2 = tt.uid_data1 and t.unique_ID > tt.unique_ID
should do the job
Upvotes: 4
Reputation: 1027
So, the one Row will be
uid_link1=1,uid_link2=9
and the other one
uid_link1=9 and uid_link2=1
right?
what about
.. WHERE x.uid_link1 < y.uid_link1 ...
but this will not remove duplicates with uid_link1=uid_link2
edit: or you can use ... WHERE x.unique_id < y.unique_id
Upvotes: 1