Reputation: 25
I have a database that allows for more than one ethnicity per person. Unfortunately, our answers are essentially Yes Hispanic, Not Hispanic, and Unknown, and there are some who do indeed have multiple selections. I need to run a large query that pulls lots of info, one of which is ethnicity, and I want to "convert" those that have multiple selections as Unknown.
person_ethnicity_xref table:
Person_ID | Ethnicity_ID |
---|---|
1234567 | SLWOWQ |
1234567 | ZLKJDU |
mstr_lists table:
Ethnicity_ID | Ethnicity |
---|---|
SLWOWQ | Hispanic |
ZLKJDU | Not Hispanic |
I've been struggling with this as I can't use a For XML Path with two tables, so I'm now trying to use the logic of Case When count(ethnicity_ID)>1 then 'Unknown' Else Ethnicity End
Here's what I have
select
p.person_nbr,
case
when count(eth1.ethnicity_item_id)>1 then 'Unknown'
else ml1.mstr_list_item_desc
end 'final eth'
from
person_table p
left join person_ethnicity_xref eth1 on p.person_id=eth1.person_id
left join mstr_lists ml1 on eth1.ethnicity_item_id=ml1.mstr_list_item_id
group by
p.person_nbr,
ml1.mstr_list_item_desc
This gave me results but when I check them, those with >1 don't have a value of Unknown and people are listed twice once with each ethnicity.
Another part of this larger query has a subquery in the FROM that counts race and a separate table join for only those with a count=1. Then the case says if the subquery that counts race came up with >1 then X otherwise use that other table for count=1. Because the race table also uses that mstr_list there's then 5 tables involved (there's a second person_id join now that I look at it more closely, and there's a mstr_list to the count and the regular tables...I have no idea why, my brain is tired and that count table isn't a simple count and also is doing something else). Is this really the only option? This query already takes over 10 min to run (it is not run on production!) and I'd hate to make it worse by duplicating what the previous writer did.
Upvotes: 1
Views: 685
Reputation: 15893
I think if you remove ml1.mstr_list_item_desc from group by clause your query would work. And use min(ml1.mstr_list_item_desc)
instead of ml1.mstr_list_item_desc
.
Below query would work.
select
p.person_nbr,
case
when count(eth1.ethnicity_item_id)>1 then 'Unknown'
else min(ml1.mstr_list_item_desc)
end 'final eth'
from
person_table p
left join person_ethnicity_xref eth1 on p.person_id=eth1.person_id
left join mstr_lists ml1 on eth1.ethnicity_item_id=ml1.mstr_list_item_id
group by
p.person_nbr
DB-FIDDLE Example:
Schema and insert statements:
create table person_table(person_id int,person_nbr varchar(10));
insert into person_table values(1234567,'abc');
insert into person_table values(1234568,'xyz');
create table person_ethnicity_xref (Person_ID int, ethnicity_item_id varchar(50));
insert into person_ethnicity_xref values(1234567, 'SLWOWQ');
insert into person_ethnicity_xref values(1234567, 'ZLKJDU');
insert into person_ethnicity_xref values(1234568, 'ZLKJDU');
create table mstr_lists(mstr_list_item_id varchar(50), mstr_list_item_desc varchar(50));
insert into mstr_lists values('SLWOWQ','Hispanic');
insert into mstr_lists values('ZLKJDU','Not Hispanic');
Query:
select
p.person_nbr,
case
when count(eth1.ethnicity_item_id)1 then 'Unknown'
else min(ml1.mstr_list_item_desc)
end 'final eth'
from
person_table p
left join person_ethnicity_xref eth1 on p.person_id=eth1.person_id
left join mstr_lists ml1 on eth1.ethnicity_item_id=ml1.mstr_list_item_id
group by
p.person_nbr
Output:
person_nbr | final eth |
---|---|
abc | Unknown |
xyz | Not Hispanic |
db<fiddle here
Upvotes: 0
Reputation: 1269503
Use aggregation:
select p.person_nbr,
(case when min(ml1.mstr_list_item_desc) = max(ml1.mstr_list_item_desc)
then min(ml1.mstr_list_item_desc)
else 'Unknown'
end) as final_ethnicity
from person_table p left join
person_ethnicity_xref eth1
on p.person_id = eth1.person_id left join
mstr_lists ml1
on eth1.ethnicity_item_id = ml1.mstr_list_item_id
group by p.person_nbr;
Note: This slightly tweaks your logic. If there a multiple ethnicities and they are all the same, then that value is used.
Upvotes: 1