Reputation: 1716
I have a blacklist of people that should never be contacted. When I want to see if a person is in this list, I do the following:
-- Query 1
SELECT *
FROM bldb.dbo.blacklist l
WHERE l.matchcode
= dbo.fn_matchcode('12345','Sesame Street','1','Eddie','Bert')
The query runs very fast, since there is an index on matchcode column, and fn_matchcode
is deterministic.
Think of matchcode as a compressed form of address and name, which helps me not to be affected from typos in street names etc. It consists of 22 chars: 13 for the address, 9 for the name. When I want to see if any person in 1 Sesame Street, 12345 is in blacklist, I do the following:
-- Query 2
SELECT *
FROM bldb.dbo.blacklist l
WHERE LEFT(l.matchcode,13)
= LEFT(dbo.fn_matchcode('12345','Sesame Street','1','Eddie','Bert'),13)
This runs extremely long...
On the contrary, this runs much faster:
-- Query 3
SELECT *
FROM bldb.dbo.blacklist l
WHERE LEFT(l.matchcode,13)
= (SELECT LEFT(dbo.fn_matchcode('12345','Sesame Street','1','Eddie','Bert'),13))
That means, the right hand side of the where condition gets calculated for every row! But why? The UDF is deterministic. Is it LEFT()
, which is not deterministic?
EDIT:
The answers so far claimed that it is because the index does not get used. However, it is still not clear to me why the following happens.
When I write the query like this:
-- Query 4
SELECT *
FROM bldb.dbo.blacklist
WHERE matchcode LIKE LEFT(dbo.fn_matchcode('12345','Sesame Street','1','Eddie','Bert'),13) + '%'
it still takes several minutes to finish. Please note that the fn_matchcode
just does some string manipulation and returns instantly.
When I hardcode the result of fn_matchcode
into the query:
-- Query 5
SELECT *
FROM bldb.dbo.blacklist
WHERE matchcode LIKE '12345SSMSTRT1%'
it takes a couple of milliseconds! How would you explain that?
Upvotes: 3
Views: 6531
Reputation: 81169
For a query to run fast, there must be an index for the thing being queried. If you are going to be querying based upon a value computed from one or more columns, you need to have an indexed column containing the result of that computation.
Upvotes: 0
Reputation: 453298
Following the update in your question can you look at the two execution plans for your queries #4 and #5 and see whether it does a clustered index scan for one and a non clustered index seek for the other? I wonder if it is because it knows the statistics for the literal at compile time but not for the function call. As it has no idea that only a handful of records will be returned it errs on the side of caution to avoid doing a whole load of bookmark lookups.
If that is the case then does the following help?
SELECT *
FROM bldb.dbo.blacklist WITH (FORCESEEK)
WHERE matchcode LIKE
LEFT(dbo.fn_matchcode('12345','Sesame Street','1','Eddie','Bert'),13) + '%'
Upvotes: 3
Reputation: 332581
That means, the right hand side of the where condition gets calculated for every row!
No, you're thinking of correlated subqueries - which this is not.
The last example is fast because the optimizer sees it as a join (due to the SELECT), vs the WHERE clause in the previous examples.
While using a function on a column will render an index on that column useless, cutting down the number of characters on a VARCHAR column (say first 10 of VARCHAR(150)) could be faster due to less to compare. INTs are 4 bytes regardless, but that's not the case for string based data types...
Upvotes: 1
Reputation: 96572
This is an issue of sargability - when you use a function like left in the where clause, the query can't use the indexes which slows it down.
Upvotes: 1
Reputation: 453298
I'd use
SELECT *
FROM bldb.dbo.blacklist l
WHERE l.matchcode LIKE
LEFT(dbo.fn_matchcode('12345','Sesame Street','1','Eddie','Bert'),13) + '%'
To allow the index on matchcode to be used. That doesn't answer your question though but is too long for a comment.
Upvotes: 1