Reputation: 11275
I've got a condition of a duplicate value in some rows - I want all rows that have that duplicate value to be deleted except for one.
So far I've got this, which returns all of the rows that have a duplicate value:
SELECT nid, vid, title, hash
FROM node_revision nr
WHERE EXISTS
(
SELECT 1
FROM node_revision nr
WHERE nr.hash = nr.hash
LIMIT 1, 1
)
But I would like to get all except the one with the lowest vid. How would I go about doing that?
For instance, the data:
nid vid title hash
2065 2081 First news story 77c0f020c160c108ed0f38e572e0ea64
2065 2082 First news story 77c0f020c160c108ed0f38e572e0ea64
2065 2083 First news story 77c0f020c160c108ed0f38e572e0ea64
2065 2084 First news story 77c0f020c160c108ed0f38e572e0ea64
I want to select all except the one with the vid of 2081.
Upvotes: 0
Views: 50
Reputation: 48
DELETE FROM node_revision a
WHERE EXISTS (SELECT 'X'
FROM node_revision b
WHERE b.hash = a.hash
AND b.vid <> (SELECT min(c.vid)
FROM node_revsion c
WHERE c.nid = b.nid))
Upvotes: 0
Reputation: 13110
SELECT nr.nid, nr.vid, nr.title, nr.hash
FROM node_revision nr
WHERE EXISTS(
SELECT 1
FROM node_revision nri
WHERE nri.vid < nr.vid
AND nri.hash = nr.hash
)
Or in plain English: 'Give me all the rows for which there is another row that has a lower vid and the same hash'
Upvotes: 2