Reputation: 2862
I have a query that returns unique set of records consisting several columns . Even though entire row is unique , columns can can have duplicates . I want to keep only rows that consist of distinct values of certain columns and discard the rest of rows . What is the best method to do it
Sample dataset:
col1 col2 col3 col4
10 Red Book1 Large
10 Blue Book1 Small
20 Blue Book1 Small
30 Red Book2 Medium
30 Blue Book2 Small
desired result
col1 col2 col3 col4
10 Red Book1 Large
30 Red Book2 Medium
above example I keep the col3 distinct and discard rest randomly.
Upvotes: 1
Views: 63
Reputation: 22949
This will work for your example:
with test (col1, col2, col3, col4) as
(
select 10, 'Red', 'Book1', 'Large' from dual union all
select 10, 'Blue','Book1', 'Small' from dual union all
select 20, 'Blue','Book1', 'Small' from dual union all
select 30, 'Red', 'Book2', 'Medium' from dual union all
select 30, 'Blue','Book2', 'Small' from dual
)
select col1, col2, col3, col4
from (
select col1, col2, col3, col4,
row_number() over ( partition by col3 order by col2 desc) countCol3
from test
)
where countCol3 = 1
Here I decided to keep, in case of more than one row wit the same value of col3
, the row with the minimum value of col2
; this is only to fit your example, so you should edit the ordering part to better fit your need.
Upvotes: 2