scleung
scleung

Reputation: 95

Performance of conditional join

I have a question about perf of the following query:

DECLARE @detail_level INT = 1,
        @DETAIL_1     INT = 1,
        @DETAIL_2     INT = 2,
        @DETAIL_3     INT = 4

SELECT mtbl.*,
       CASE
         WHEN @detail_level & @DETAIL_1 <> 0 THEN tbl_1.value
         ELSE NULL
       END,
       CASE
         WHEN @detail_level & @DETAIL_2 <> 0 THEN tbl_2.value
         ELSE NULL
       END,
       CASE
         WHEN @detail_level & @DETAIL_3 <> 0 THEN tbl_3.value
         ELSE NULL
       END
FROM   mtbl
       LEFT OUTER JOIN tbl_1
         ON @detail_level & @DETAIL_1 <> 0
            AND mtbl.KEY = tbl_1.KEY
       LEFT OUTER JOIN tbl_2
         ON @detail_level & @DETAIL_2 <> 0
            AND mtbl.KEY = tbl_2.KEY
       LEFT OUTER JOIN tbl_3
         ON @detail_level & @DETAIL_3 <> 0
            AND mtbl.KEY = tbl_3.KEY
WHERE  mtbl.KEY = @something 

Will the query engine optimize the query by using the detail level filter to avoid unnecessary table join and unnecessary table row access for those detail table columns?

If yes, will query engine even not acquire index locks for those unnecessary table?

I captured the query plan in SQL Server 2008 and index search operations still existed even detail level did not match. But the query with less details did (~ 30%) faster than the one with full details (data volumn ~ 500,000).

Upvotes: 4

Views: 1135

Answers (2)

Martin Smith
Martin Smith

Reputation: 453618

You would need to check the execution plan to be sure. When I create the following example tables

CREATE TABLE tbl_1([KEY] INT PRIMARY KEY, value INT)
CREATE TABLE tbl_2([KEY] INT PRIMARY KEY, value INT)
CREATE TABLE tbl_3([KEY] INT PRIMARY KEY, value INT)
CREATE TABLE mtbl([KEY] INT PRIMARY KEY, value INT)

INSERT INTO mtbl VALUES(1,1),(2,2) 
INSERT INTO tbl_1 VALUES(1,1),(2,2)
INSERT INTO tbl_2 VALUES(1,1),(2,2)
INSERT INTO tbl_3 VALUES(1,1),(2,2)

your query gives the execution plan

Plan

Whilst this appears to show operations against all four tables of equal costs, this is somewhat misleading. Only two of them are actually accessed. (Costs shown even in actual execution plans are just estimated costs.)

After running SET STATISTICS IO ON;, your example query returns

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tbl_1'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'mtbl'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

As you can see, neither tbl_2 nor tbl_3 feature in the results. Looking at the properties of the index seek on tbl_3 shows that this operator was never executed (Number of Executions = 0)

Seek

This is because the outer join is underneath a filter with a startup predicate and only called when that evaluates to true.

Filter

Regarding your question about locking, no row or page locks are taken against tbl_2 or tbl_3 but IS locks are still taken out against the object. This can be seen by running

DBCC TRACEON(1200,3604,-1)
/*Your query*/ 
DBCC TRACEOFF(1200,3604,-1)

IS locks are not compatible with schema modification, bulk update or exclusive locks on the object.

Upvotes: 4

Andomar
Andomar

Reputation: 238176

SQL Server will create a query plan that will work for any value of the parameters. But it will optimize for the first version of the parameters it encounters. This is called "parameter sniffing".

You can force a recompile with the option (recompile) query hint. A recompile adds 20-30ms to the query time, but the query plan will be optimized for the particular parameter values.

Another option is to use dynamic SQL to generate a query that only joins the required tables. This results in multiple cached execution plans. This performs best, but at the cost of being hard to maintain.

As a side note, the columns of a left joined table are set to null when no row matches the on condition. So you can shorten:

case when @detail_level & @DETAIL_1 <> 0 then tbl_1.value else null end

to:

tbl_1.value

Upvotes: 1

Related Questions