peter.petrov
peter.petrov

Reputation: 39477

SQL / T-SQL short-circuit

I have a table in SQL Server 2012 which has these values (they are all integers).

     a      b
     1      1
     1      0
    -1      1
     2      1

This query works OK.

select * from T1 
where
b <> 0 and a / b > 0

This query is failing.

select * from T1 
where
b * b * b <> 0 and a / b > 0

I think I have a slight idea why but still... Why? Could you clearly explain (or point me to some official docs) what does SQL Server short-circuit and what not?

I find statements saying that T-SQL does support short-circuit but if the evaluation order is not guaranteed then... isn't the short-circuiting process ambiguous or let's say not well defined?

I am kind of confused on this.

Upvotes: 2

Views: 331

Answers (3)

Erran Morad
Erran Morad

Reputation: 4753

I had learn this problem from book - Inside Microsoft SQL Server : T-SQL Querying. I copy book content and post it here. Your question is answered after the first example.

Key concept - All-At-Once Operations

SQL supports a concept called all-at-once operations, which means that all expressions that appear in the same logical query processing phase are evaluated as if at the same point in time.

This concept explains why, for example, you cannot refer to column aliases assigned in the SELECT clause within the same SELECT clause, even if it seems intuitively that you should be able to. Consider the following query:

SELECT
  orderid,
  YEAR(orderdate) AS orderyear,
  orderyear + 1 AS nextyear
FROM Sales.Orders;

The reference to the column alias orderyear is invalid in the third expression in the SELECT list, even though the referencing expression appears "after" the one where the alias is assigned. The reason is that logically there is no order of evaluation of the expressions in the SELECT list—it’s a set of expressions. At the logical level all expressions in the SELECT list are evaluated at the same point in time. Therefore this query generates the following error:

Msg 207, Level 16, State 1, Line 4
Invalid column name 'orderyear'.

Here’s another example of the relevance of all-at-once operations: Suppose you had a table called T1 with two integer columns called col1 and col2, and you wanted to return all rows where col2/col1 is greater than 2. Because there may be rows in the table where col1 is equal to 0, you need to ensure that the division doesn’t take place in those cases—otherwise, the query fails because of a divide-by-zero error. So if you write a query using the following format:

SELECT col1, col2
FROM dbo.T1
WHERE col1 <> 0 AND col2/col1 > 2;

You assume that SQL Server evaluates the expressions from left to right, and that if the expression col1 <> 0 evaluates to FALSE, SQL Server will short-circuit; that is, it doesn’t bother to evaluate the expression 10/col1 > 2 because at this point it is known that the whole expression is FALSE. So you might think that this query never produces a divide-by-zero error.

SQL Server does support short circuits, but because of the all-at-once operations concept in ANSI SQL, SQL Server is free to process the expressions in the WHERE clause in any order that it likes. SQL Server usually makes decisions like this based on cost estimations, meaning that typically the expression that is cheaper to evaluate is evaluated first. You can see that if SQL Server decides to process the expression 10/col1 > 2 first, this query might fail because of a divide-by-zero error.

You have several ways to try and avoid a failure here. For example, the order in which the WHEN clauses of a CASE expression are evaluated is guaranteed. So you could revise the query as follows:

SELECT col1, col2
FROM dbo.T1
WHERE
  CASE
    WHEN col1 = 0 THEN 'no' – or 'yes' if row should be returned
    WHEN col2/col1 > 2 THEN 'yes'
    ELSE 'no'
  END = 'yes';

In rows where col1 is equal to zero, the first WHEN clause evaluates to TRUE and the CASE expression returns the string ‘no’ (replace with ‘yes’ if you want to return the row when col1 is equal to zero). Only if the first CASE expression does not evaluate to TRUE—meaning that col1 is not 0—does the second WHEN clause check whether the expression 10/col1 > 2 evaluates to TRUE. If it does, the CASE expression returns the string ‘yes.’ In all other cases, the CASE expression returns the string ‘no.’ The predicate in the WHERE clause returns TRUE only when the result of the CASE expression is equal to the string ‘yes.’ This means that there will never be an attempt here to divide by zero.

This workaround turned out to be quite convoluted, and in this particular case we can use a simpler mathematical workaround that avoids division altogether:

SELECT col1, col2
FROM dbo.T1
WHERE col1 <> 0 and col2 > 2*col1;

I included this example to explain the unique and important all-at-once operations concept, and the fact that SQL Server guarantees the processing order of the WHEN clauses in a CASE expression.

There is more in this link - http://social.technet.microsoft.com/wiki/contents/articles/20724.all-at-once-operations-in-t-sql.aspx

Upvotes: 1

podiluska
podiluska

Reputation: 51504

If you're trying to avoid the divide by zero error, then you can use NULLIF

ie

 select * from T1 where b <> 0 and a / nullif(b,0) > 0

 select * from @t T1 where b * b * b <> 0 and a / nullif(b,0) > 0

will both execute successfully.

If you want to know what's happening, look at the actual execution plan.

The first query will show the predicate of

 [T1].[b]<>(0) 
 AND [T1].[a]/CASE WHEN [T1].[b]=(0) THEN NULL ELSE [T1].[b] END

where the second will evaluate as

[T1].[a]/[T1].[b]>(0) AND [T1].[b]*[T1].[b]*[T1].[b]<>(0)

I would suggest that the optimizer thinks that multiplying three values is more complex than a single division operation, so evaluates that first - indeed, if you change the second query to

 select * from T1 where b * b * b <> 0 and a*a / b > 0

or

 select * from T1 where power(b,3)<> 0 and a / b > 0

it will successfully execute again.

You will also note that changing the order of the filters in the first query makes no difference - the simpler operation is carried out first.

Upvotes: 1

TheMP
TheMP

Reputation: 8427

The specification of SQL short-circuits in SQL server is very blurry. From what I've heard, the only time you can be sure that your query will be lazy-evaluated is CASE instruction with multiple WHEN entries. There is no guarantee even when you are using simple OR/AND expressions. See this article if you want to know more.

Upvotes: 1

Related Questions