Reputation: 2957
Lets consider a simple table with 2 relevant columns:
bit IsDownloaded not null
datetime DateDownloaded null
I know that IsDownloaded is 1 only when DateDownloaded has a value (This is just an example I understand that IsDownloaded is not neccessary then).
Is there some performance difference between:
SELECT *
FROM files
WHERE IsDownloaded = 1
AND DateDownloaded IS NOT NULL
AND DateDownloaded > '2010-01-01'
and
SELECT *
FROM files
WHERE DateDownloaded IS NOT NULL
AND DateDownloaded > '2010-01-01'
Therefore does it help to:
Consider that there are no indexes applied on any columns.
Upvotes: 2
Views: 1898
Reputation: 10098
If the predicates are correlated (as in your case) there will be no performance benefic including them both, rather the opposite. The optimizer makes assumptions and calculations based on statistical information to estimate the cardinality, which is great. Unfortunatelly, when predicates are correlated, optimizer doesn't know that, and makes the same assumptions as they weren't, so possibly generating a wrong execution plan for the query.
Here's link to Paul White's article on the subject:
http://www.sqlperformance.com/2014/01/sql-plan/cardinality-estimation-for-multiple-predicates
Upvotes: 2
Reputation: 107317
TL;DR
Its all about the indexes, and the selectivity of the query. In the case above, ensuring that there is an index on DateDownloaded
will drive the performance of the query - the IS NOT NULL
and IsDownloaded
checks won't help (if the flag is correlated to the date).
Explanation
With no indexes at all, there would be no alternative for Sql other than to iterate through all rows in the table evaluating your where predicate (full scan).
With an index on DateDownloaded
, the option now exists to use the index to exclude rows from evaluation against the remainder of the predicate (provided that not all files
have DateDownloaded > '2010-01-01'
)
(Indexing the IsDownloaded
flag would probably not be a good idea if this only has 2 states, and if neither state represents less than a couple of % of the data).
So yes, in the general case, you should always use any additional information available, which would further reduce the number of rows to be evaluated (although in truth this is usually an automatic process, since the query would otherwise return the wrong results).
But in the case here, given that as you say the IsDownloaded
flag is only set if there is also a DateDownloaded
date, it would be redundant to include this in the predicate (as is the NOT NULL
) as it will not exclude any additional rows than the > 2010-01-01
criterion.
Upvotes: 4