Reputation: 409
In SQL, I have a table where one column can have duplicate values. How do I select only the most recent duplicate value? The dates are spread out over a month
This is my current code that does not account for the duplicate values (Note: COLUMN1 is the column which has duplicate values)
select count(distinct COLUMN1)
from TABLE
where extract(month from DATE_COLUMN) = extract(month from sysdate) - 1
and extract(year from DATE_COLUMN) = extract(year from sysdate)
and COLUMN1 not like '%X';
Thanks
Upvotes: 0
Views: 116
Reputation: 409
I ended up doing this in a different manner than the ones suggested to me. I bascially created a table that was compromised of two different tables, one with the duplicate values and one without.
select DATECOLUMN, count(COLUMN1)
FROM (select COLUMN1, max(DATECOLUMN) "DATECOLUMN"
from TABLE
where extract(month from DATECOLUMN) = extract(month from sysdate) - 1
and extract(year from DATECOLUMN) = extract(year from sysdate)
group by COLUMN1
HAVING count(COLUMN1) > 1
UNION
select COLUMN1, max(DATECOLUMN) "DATECOLUMN"
from TABLE
where extract(month from DATECOLUMN) = extract(month from sysdate) - 1
and extract(year from DATECOLUMN) = extract(year from sysdate)
group by COLUMN1
HAVING count(COLUMN1) < 2) "NEW_TABLE"
where COLUMN1 not like '%X'
group by DATECOLUMN
order by DATECOLUMN;
Upvotes: 1
Reputation: 167794
You can use the ROW_NUMBER()
analytic function to get the most recent row for each value of column1
:
SELECT *
FROM (
SELECT t.*,
ROW_NUMBER() OVER ( PARTITION BY column1
ORDER BY date_column DESC ) AS rn
FROM table_name
)
WHERE rn = 1;
Upvotes: 1
Reputation: 1322
You can use the ROW_NUMBER() function in a subquery to identify dupes and rank them by a date column. I'm not familiar with oracle but it should be something like this:
SELECT ROW_NUMBER() OVER(PARTITION BY COLUMN1 ORDER BY DATE_COLUMN DESC)
Check out this link for an explanation: https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions137.htm
Upvotes: 1