Reputation: 107
I am having a problem with Oracle query where the basic goal is to get the last row of every re-occurring rows, but there's a complication that you'll understand from the data:
Suppose I have a table that looks like this:
ID | COL1 | COL2 | COL3 | UPDATED_DATE
------|------|------|------|-------------
001 | a | b | c | 14/05/2013
002 | a | b | c | 16/05/2013
003 | a | b | c | 12/05/2013
You should be able to guess that since columns 1 to 3 have the same values for all 3 rows they are re-occurring data. The problem is, I want to get the latest updated row, which means row #2.
I have an existing query that works if the table is without ID column, but I still need that column, so if anybody could help me point out what I'm doing wrong, that'd be great.
select col1,
col2,
col3,
max(updated_date)
from tbl
order by col1, col2, col3;
The above query returns me row #2, which is correct, but I still need the ID.
Note: I know that I could have encapsulate the above query with another query that selects the ID column based on the 4 columns, but since I'm dealing with millions of records, the re-query will make the app very ineffective.
Upvotes: 2
Views: 20450
Reputation: 92805
Try
WITH qry AS
(
SELECT ID, COL1, COL2, COL3, updated_date,
ROW_NUMBER() OVER (PARTITION BY COL1, COL2, COL3 ORDER BY updated_date DESC) rank
FROM tbl
)
SELECT ID, COL1, COL2, COL3, updated_date
FROM qry
WHERE rank = 1
or
SELECT t1.ID, t2.COL1, t2.COL2, t2.COL3, t2.updated_date
FROM tbl t1 JOIN
(
SELECT COL1, COL2, COL3, MAX(updated_date) updated_date
FROM tbl
GROUP BY COL1, COL2, COL3
) t2 ON t1.COL1 = t2.COL1
AND t1.COL2 = t2.COL2
AND t1.COL3 = t2.COL3
AND t1.updated_date = t2.updated_date
Output in both cases:
| ID | COL1 | COL2 | COL3 | UPDATED_DATE | -------------------------------------------------------- | 2 | a | b | c | May, 16 2013 00:00:00+0000 |
Here is SQLFiddle demo for both queries.
Upvotes: 13