Dana
Dana

Reputation: 32957

SQL Server subquery behaviour

I have a case where I want to check to see if an integer value is found in a column of a table that is varchar, but is a mix of values that can be integers some are just strings. My first thought was to use a subquery to just select the rows with numeric-esque values. The setup looks like:

CREATE TABLE #tmp (
EmployeeID varchar(50) NOT NULL
)

INSERT INTO #tmp VALUES ('aa1234')
INSERT INTO #tmp VALUES ('1234')
INSERT INTO #tmp VALUES ('5678')

DECLARE @eid int
SET @eid = 5678

SELECT * 
FROM (
    SELECT EmployeeID
    FROM #tmp
    WHERE IsNumeric(EmployeeID) = 1) AS UED 
WHERE UED.EmployeeID = @eid

DROP TABLE #tmp

However, this fails, with: "Conversion failed when converting the varchar value 'aa1234' to data type int.".

I don't understand why it is still trying to compare @eid to 'aa1234' when I've selected only the rows '1234' and '5678' in the subquery.

(I realize I can just cast @eid to varchar but I'm curious about SQL Server's behaviour in this case)

Upvotes: 1

Views: 199

Answers (1)

Aaron Bertrand
Aaron Bertrand

Reputation: 280262

You can't easily control the order things will happen when SQL Server looks at the query you wrote and then determines the optimal execution plan. It won't always produce a plan that follows the same logic you typed, in the same order.

In this case, in order to find the rows you're looking for, SQL Server has to perform two filters:

  1. identify only the rows that match your variable
  2. identify only the rows that are numeric

It can do this in either order, so this is also valid:

  1. identify only the rows that are numeric
  2. identify only the rows that match your variable

If you look at the properties of this execution plan, you see that the predicate for the match to your variable is listed first (which still doesn't guarantee order of operation), but in any case, due to data type precedence, it has to try to convert the column data to the type of the variable:

enter image description here

Subqueries, CTEs, or writing the query a different way - especially in simple cases like this - are unlikely to change the order SQL Server uses to perform those operations.

You can force evaluation order in most scenarios by using a CASE expression (you also don't need the subquery):

SELECT EmployeeID
FROM #tmp
WHERE EmployeeID = CASE IsNumeric(EmployeeID) WHEN 1 THEN @eid END;

In modern versions of SQL Server (you forgot to tell us which version you use), you can also use TRY_CONVERT() instead:

SELECT EmployeeID
FROM #tmp
WHERE TRY_CONVERT(int, EmployeeID) = @eid;

This is essentially shorthand for the CASE expression, but with the added bonus that it allows you to specify an explicit type, which is one of the downsides of ISNUMERIC(). All ISNUMERIC() tells you is if the value can be converted to any numeric type. The string '1e2' passes the ISNUMERIC() check, because it can be converted to float, but try converting that to an int...

For completeness, the best solution - if there is an index on EmployeeID - is to just use a variable that matches the column data type, as you suggested.

But even better would be to use a data type that prevents junk data like 'aa1234' from getting into the table in the first place.

Upvotes: 4

Related Questions