Vojtech B
Vojtech B

Reputation: 2957

T-SQL - Do more conditions help the query proceed faster?

Lets consider a simple table with 2 relevant columns:

bit IsDownloaded not null
datetime DateDownloaded null

I know that IsDownloaded is 1 only when DateDownloaded has a value (This is just an example I understand that IsDownloaded is not neccessary then).

Is there some performance difference between:

SELECT * 
FROM files 
WHERE IsDownloaded = 1 
  AND DateDownloaded IS NOT NULL 
  AND DateDownloaded > '2010-01-01'

and

SELECT * 
FROM files 
WHERE DateDownloaded IS NOT NULL 
  AND DateDownloaded > '2010-01-01'

Therefore does it help to:

  1. Add "easier to evaluate" conditions (such as conditions on boolean datetypes)
  2. Add mode conditions in general

Consider that there are no indexes applied on any columns.

Upvotes: 2

Views: 1898

Answers (2)

dean
dean

Reputation: 10098

If the predicates are correlated (as in your case) there will be no performance benefic including them both, rather the opposite. The optimizer makes assumptions and calculations based on statistical information to estimate the cardinality, which is great. Unfortunatelly, when predicates are correlated, optimizer doesn't know that, and makes the same assumptions as they weren't, so possibly generating a wrong execution plan for the query.

Here's link to Paul White's article on the subject:

http://www.sqlperformance.com/2014/01/sql-plan/cardinality-estimation-for-multiple-predicates

Upvotes: 2

StuartLC
StuartLC

Reputation: 107317

TL;DR
Its all about the indexes, and the selectivity of the query. In the case above, ensuring that there is an index on DateDownloaded will drive the performance of the query - the IS NOT NULL and IsDownloaded checks won't help (if the flag is correlated to the date).

Explanation
With no indexes at all, there would be no alternative for Sql other than to iterate through all rows in the table evaluating your where predicate (full scan).

With an index on DateDownloaded, the option now exists to use the index to exclude rows from evaluation against the remainder of the predicate (provided that not all files have DateDownloaded > '2010-01-01')

(Indexing the IsDownloaded flag would probably not be a good idea if this only has 2 states, and if neither state represents less than a couple of % of the data).

So yes, in the general case, you should always use any additional information available, which would further reduce the number of rows to be evaluated (although in truth this is usually an automatic process, since the query would otherwise return the wrong results).

But in the case here, given that as you say the IsDownloaded flag is only set if there is also a DateDownloaded date, it would be redundant to include this in the predicate (as is the NOT NULL) as it will not exclude any additional rows than the > 2010-01-01 criterion.

Upvotes: 4

Related Questions