Select all rows per query except for one

Question

I've got a condition of a duplicate value in some rows - I want all rows that have that duplicate value to be deleted except for one.

So far I've got this, which returns all of the rows that have a duplicate value:

SELECT  nid, vid, title, hash
FROM    node_revision nr
WHERE   EXISTS
        (
        SELECT  1
        FROM    node_revision nr
        WHERE   nr.hash = nr.hash
        LIMIT 1, 1
        )

But I would like to get all except the one with the lowest vid. How would I go about doing that?

For instance, the data:

nid     vid          title                        hash
2065    2081    First news story    77c0f020c160c108ed0f38e572e0ea64
2065    2082    First news story    77c0f020c160c108ed0f38e572e0ea64
2065    2083    First news story    77c0f020c160c108ed0f38e572e0ea64
2065    2084    First news story    77c0f020c160c108ed0f38e572e0ea64

I want to select all except the one with the vid of 2081.

Arth · Accepted Answer

SELECT nr.nid, nr.vid, nr.title, nr.hash
  FROM node_revision nr
 WHERE EXISTS(
  SELECT 1
    FROM node_revision nri
   WHERE nri.vid < nr.vid
     AND nri.hash = nr.hash     
      )

Or in plain English: 'Give me all the rows for which there is another row that has a lower vid and the same hash'

Select all rows per query except for one

Answers (2)

Related Questions