Reputation: 83
I have data in my table which looks like the below :
INPUT :
version value code type
PMS 0.00 01 HOURS
000 312.00 01 HOURS
000 0.00 04 HOURS
PMS 0.00 01 NON STOCK
000 835.00 01 NON STOCK
000 835.00 04 NON STOCK
Explanation for the output :
For code = 01, we have 4 records, 2 records for the type = HOURS and 2 records for type = NON STOCK. but choose the record with Version not equal to PMS(due to duplicates in column type). hence we get two outputs with code 01.
000 312.00 01 HOURS
000 835.00 01 NON STOCK
For code = 04, we have 2 records, 1 record for the type = HOURS and 1 record for type = NON STOCK. since we don't have duplicates for column type here, need not compare version in this case. hence we get two outputs with code 01
000 0.00 04 HOURS
000 835.00 04 NON STOCK
(The database Engine is Azure databricks)
EXPECTED OUTPUT :
version value code type
000 312.00 01 HOURS
000 835.00 01 NON STOCK
000 0.00 04 HOURS
000 835.00 04 NON STOCK
Upvotes: 0
Views: 201
Reputation: 48800
You can do:
select
version, value, code, type
from (
select *,
row_number() over(partition by code, type order by version) as rn
from t
) x
where rn = 1
order by code, type
Upvotes: 0
Reputation: 162
Assuming that if there is no duplicate rows you need to pull the record even if the version is PMS. Try this:
select
*
from(
select
a.*,
case when b.code is not null and version <>'PMS' then 1
when b.code is null then 1
else 0 end as filter_val
from
input a
left outer join
(Select
code,
type
from input
group by
code,
type having count(*) > 1) b
on a.code=b.code
and a.type=b.type) x
where filter_val=1;
Upvotes: 1