Reputation: 19
I'm currently working on optimizing a rather lengthy snippet of code on a large table and have a general question that would help me understand SQL optimization a bit better. What I'm trying to accomplish goes as follows.
Suppose Table A has these columns:
X int, Y int, Z int
where X is non distinct i.e many rows with X = x and each Y is distinct.
And Table B has these columns
I int, J int, K int
Then
I = X if Z IN (1,3)
J = Y if Z IN (1,3)
K = 0 if y = Y is the smallest y for a given X
K = 1 if y = Y is the largest y for a given X
K IS NULL otherwise
For example suppose we have table A
X Y Z
1 5 1
1 2 3
1 3 3
2 6 1
2 7 3
3 8 1
3 9 2
3 10 1
3 11 3
Then after the process B should look like
I J K
1 2 0
1 3 NULL
1 5 1
2 6 0
2 7 1
3 8 1
3 10 NULL
3 11 1
Currently I'm straight inserting A into B, pruning those rows where Z != 1 or Z !=3. Then updating table B twice, first to find those instances where K should be 0 and then to find those instances where K should be 1. It seems to me that it should be possible and more optimal to do it all in the initial insert. Being a long time Java programmer, I'm inclined to think about how looping through the rows would make life so much easier so my big questions are,
What's the optimal way to solve this problem?
Does anyone have any T-SQL paradigm tips on what to do when I'm inclined to loop through rows to complete a given task?
Upvotes: 0
Views: 87
Reputation: 15816
The following isn't optimized, but is a straightforward translation of the "requirements" into a select
statement.
Note that supplying sample data in a consumable form would make it easier for us to help you.
-- Sample data.
declare @A as Table ( X Int, Y Int, Z Int );
insert into @A ( X, Y, Z ) values
( 1, 5, 1 ), ( 1, 2, 3 ), ( 1, 3, 3 ), ( 2 ,6, 1 ), ( 2, 7, 3 ),
( 3, 8, 1 ), ( 3, 9, 2 ), ( 3, 10, 1 ), ( 3, 11, 3 );
select * from @A;
-- Generate the results.
select
-- Columns I and J don't require any effort since the where clause
-- eliminates any rows where Z isn't 1 or 3.
I = X,
J = Y,
-- Brute force the rules for K .
-- This is where optimization can occur by getting the Min and Max
-- value of Y for each value of X only once.
K = case
when Y = ( select Min( IA.Y ) from @A as IA where IA.X = A.X ) then 0
when Y = ( select Max( IA.Y ) from @A as IA where IA.X = A.X ) then 1
else NULL end
from @A as A
where Z in ( 1, 3 );
Upvotes: 1