frgtv10
frgtv10

Reputation: 5460

Oracle 11g Get distinct of multiple rows

I'm having a table with images and one table with image info (for example: size).

**TABLE `images`**
id | name  | path | status       | changedate
---------------------------------------------
1  | 1.tif | xxx/ | corrected    | 2013/03/13
2  | 1.tif | zzz/ | uncorrected  | 2013/03/13
3  | 1.tif | yyy/ | corrected    | 2013/03/14
4  | 2.tif | zzz/ | correct      | 2013/03/29

**TABLE `imageinfo`** 
imageid | size
------------------
1       | 100x200
2       | 100x200
3       | 100x200
4       | 127x127

For info: It is possible that there are more than 1 image with the same name.

I want to get the fields name, size, status and changedate of the images ordered by the name, status and changedate (ordered in this order).

As you can see its possible that there are more than 1 image with the same name. I only want one row per name, the first row if there are duplicate name images.

What I already have:

SELECT 
  name,
  size,
  status,
  changedate
FROM images
JOIN imageinfo ON images.id = imageinfo.imageid
ORDER BY
  name, 
  status,
  changedate

Executed output:

3 | 1.tif | 100x200 | corrected   | 2013/03/14
1 | 1.tif | 100x200  | corrected   | 2013/03/13
2 | 1.tif | 100x200 | uncorrected | 2013/03/13
4 | 2.tif | 127x127 | correct     | 2013/03/29

Required output:

3 | 1.tif | 100x200 | corrected   | 2013/03/14
4 | 2.tif | 127x127 | correct     | 2013/03/29

A distinct wont work here, thats what I already figured out. But how to do this in Oracle?

Upvotes: 0

Views: 87

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270873

You can identify the first row using row_number and then use that to filter out the duplicates:

select name, size, status, changedate
from (SELECT name, size, status, changedate,
             row_number() over (partition by i.name order by i.changedate desc) as seqnum
      FROM images i JOIN
           imageinfo ii
           ON i.id = ii.imageid
     ) i
where seqnum = 1
order by name, status, changedate;

EDIT:

The row_number() function generates a sequence of integers starting with 1 for groups of rows. The groups of rows are defined by the partition by clause. So, in this case, each set of rows with the same i.name will re-start the numbering. The order of the integers is based on the order by clause. So, in this case, the most recent change date within each group gets a value of 1 and so on.

Upvotes: 4

Related Questions