Reputation: 8054
I've got a stored procedure that runs a couple queries in a WHILE
loop, and these queries in particular take exponentially longer than the rest of the SP, greatly bogging down my performance.
I've read in a few places that using an IN
vs. an INNER JOIN
can improve performance if you're only pulling columns from the first table, which is exactly what I'm doing.
The problem, however, is that I haven't the first clue on how I can replace my INNER JOIN
with an IN
clause. I'm getting the notion that the syntax for an IN
is very different from an INNER JOIN
, but I don't know where to start.
As a footnote, SQL is not my strong suit. All suggestions are greatly appreciated.
Here's one of the queries. They're both pretty much identical in terms of logic.
SELECT @CurrentKey = riv.Key
FROM ResourceInfo_VW riv
INNER JOIN (SELECT Track,
Code,
[Language]
FROM ResourceInfo_VW
WHERE Key = @CurrentKey) AS riv2
ON riv.Track = riv2.Track
AND riv.Code = riv2.Code
AND riv.[Language] = riv2.[Language]
INNER JOIN UserGroupCourseCatalog_VW ugcc
ON riv.Key = ugcc.Key
WHERE riv.[Type] = 'a'
AND ugcc.UserGroupKey = @UserGroupKey
Upvotes: 1
Views: 6425
Reputation: 8054
It took a bit of thinking, but I optimized this performance greatly by removing all of the INNER JOIN
s from the relevant statements.
Relative to the time it takes to run, this query gets faster and faster based on the number of records being processed.
Here's what I ended up with. You'll notice a few extra variables. These are set at the start of every WHILE
iteration.
SELECT @CurrentTrack = Track,
@CurrentCode = Code,
@CurrentLanguage = [Language]
FROM ResourceInfo_VW riv
WHERE riv.Key = @CurrentKey
SELECT riv.Key
FROM ResourceInfo_VW riv
WHERE riv.[Type] = 'a'
AND riv.Track = @CurrentTrack
AND riv.Code = @CurrentCode
AND riv.[Language] = @CurrentLanguage
AND riv.CourseKey IN (SELECT CourseKey
FROM UserGroupCourseCatalog_VW
WHERE UserGroupKey = @UserGroupKey)
Upvotes: 0
Reputation: 1254
Well very first thing I notice is there is a “correlated sub query” which is really not required. You have select in to variable that means only one value get selected at the end even if multiple KEY exists. In that case what key will get selected in variable is up to SQL server as there is no order by clause.
Consider adding Top and order by so you know in case of multiple row satisfying WHERE Conditions, what will be get in to result. Result should be at least consistent every time you run same query.
I would rewrite the query in as below. If proper indexes are there on table the performance should not be bad.
Note that TOP 1 is added because in the variable you cannot store multiple values anyways. So find out which ONE value you want to store. MIN, MAX, TOP1, etc….
Also I don’t see a need for doing SELF JOIN of the table “ResourceInfo_VW”. You can correct me if I am wrong here.
SELECT
TOP 1
@Key= riv.KEY
FROM ResourceInfo_VW riv
INNER JOIN UserGroupCourseCatalog_VW ugcc
ON riv.KEY = ugcc.KEY AND riv.[Type] = 'a'
WHERE ugcc.UserGroupKey = @UserGroupKey
and just for the note, if you try to understand the logic of why this query is written in current way then only you can go to next step, in your case which is re-writing query using different operator.
Upvotes: 2
Reputation: 18155
One option is to create a table variable to store your intermediate results.
Start by inserting the @UserGroupKey
s inside a while
loop. Then run a single update ... select
against your real data tables. This will be the most expensive part of the operation.
The table variable will now have lookup of UserGroupKey
to Key
and should perform much faster than searching the raw tables.
declare @t table (UserGroupKey nvarchar(255), Key nvarchar(255))
while (...)
begin
insert into @t(UserGroupKey) values (@UserGroupKey)
end
-- this is the expensive part, but it only happens once
update @t
set Key = riv.KEY
FROM ResourceInfo_VW riv
INNER JOIN (SELECT Track,
Code,
[Language]
FROM ResourceInfo_VW
WHERE KEY = @Key) AS riv2
ON riv.Track = riv2.Track
AND riv.Code = riv2.Code
AND riv.[Language] = riv2.[Language]
INNER JOIN UserGroupCourseCatalog_VW ugcc
ON riv.KEY = ugcc.KEY
WHERE riv.[Type] = 'a'
AND ugcc.UserGroupKey = @t.UserGroupKey
while (...)
begin
select @Key = Key
from @t
where @t.UserGroupKey = @UserGroupKey
end
Upvotes: 2
Reputation: 3834
Yes the formats are very different and they technically do different things: This is how a join is done:
SELECT * FROM TABLEA a JOIN TABLEB on a.commonfield = b.commonfield
This is how you use an IN statement
SELECT * FROM TABLEA WHERE commonfield in (SELECT commonfield from tableb)
Upvotes: 2