Reputation: 1
Using SQL Server 2012. Querying data from two separate tables with non-overlapping key values. I was able to calculate a common key using CASE
statements that takes the following form:
Query #1:
select
"Region" = case
when Facility = 100 then Region = 'North'
when Facility = 200 then Region = 'South'
when Facility = 300 then Region = 'Midwest'
else Null
end,
Other_Variable_1, Other_Variable_2, ...
from
Data1
Query #2:
select
"Region" = case
when Facility = 63 then Region = 'North'
when Facility = 67 then Region = 'South"
when Facility = 89 then Region = 'Midwest'
else Null
end,
Other_Variable_A, Other_Variable_A, ...
from
Data2
I tried to link the two queries by running a JOIN
statement on the two CASE
statements as follows:
select
a.Other_Variable_1, b.Other_Variable_B
from
Data1 a
inner join
Data2 b on (case
when a.Facility = 100 then Region = 'North'
when a.Facility = 200 then Region = 'South'
when a.Facility = 300 then Region = 'Midwest'
else Null
end) = (case
when b.Facility = 63 then Region = 'North'
when b.Facility = 67 then Region = 'South"
when b.Facility = 89 then Region = 'Midwest'
else Null
end)
The query ran for over an hour before I gave up. Is there a better way to join the two case statements without publishing two separate tables and then running a third query?
Upvotes: 0
Views: 56
Reputation: 6963
I know you don't want extra tables. I just want to share my experience on the same case as yours. This method is more flexible that user can change mapping between Facility and Region. We have to join two tables (10 millions row) with computed value. We solve the performance issue by.
1.Add 1 "Region" column for each table
2.Every night there will be data processing that will update those columns base on specified value by user via MDS (I assume only user know the relation between facility and region)
create table Data1_Mapping_Facility_Region
(
Facility int,
Region nvarchar(60)
)
create index IX_Data1_Mapping_Facility_Region_Facility on Data1 (Facility) include (region)
create table Data2_Mapping_Facility_Region
(
Facility int,
Region nvarchar(60)
)
create index IX_Data2_Mapping_Facility_Region_Facility on Data2 (Facility) include (region)
insert into Data1_Mapping_Facility_Region values(100, 'North'), (200, 'South'), (300, 'Midwest')
insert into Data2_Mapping_Facility_Region values(63, 'North'), (67, 'South'), (89, 'Midwest')
Mapping_Data1_Facility_Region
100 'North'
200 'South'
300 'Midwest'
Mapping_Data2_Facility_Region
63 'North'
67 'South'
89 'Midwest'
Create index for Data1 and Data2
create index IX_Data1_Region on Data1 (Region) include (Other_Variable_1)
create index IX_Data2_Region on Data2 (Region) include (Other_Variable_B)
update Data1 and Data2
update data
set data.Region = map.Region
from Data1 data
inner join Data1_Mapping_Facility_Region map
on data.Facility = map.Facility
update data
set data.Region = map.Region
from Data2 data
inner join Data2_Mapping_Facility_Region map
on data.Facility = map.Facility
Change query to
select a.Other_Variable_1, b.Other_Variable_B
from Data1 a
inner join Data2 b on a.Region = b.Region
Compare query plan
Upvotes: 1
Reputation: 15508
Yes, try to do it completely without the CASE
statement and directly map the values onto each other:
select a.Other_Variable_1, b.Other_Variable_B
from Data1 a
inner join Data2 b
on (a.Facility = 100 and b.Facility = 63)
or (a.Facility = 200 and b.Facility = 67)
or (a.Facility = 300 and b.Facility = 89)
Next, make sure that no primary key is missing -- I have the feeling this is the case here. Otherwise you will get a cartesian product-like join of the data sets in your tables.
Upvotes: 0