vic
vic

Reputation: 311

Should I still use short-circuiting in a WHERE clause in SQL to test conditions?

In a question from over 4 years ago, the poster asked if and how the following code could be simplified:

CREATE Procedure getAllEmployeesByDeptAndFlag
@Dept int,
@sal int,
@Flag int
AS
if @flag = 1
    select Department, Salary
    from employee 
    where Department = @dept and Salary > @sal
else 
    select Department, Salary
    from employee 
    where Department = @dept

The answers to this and similar questions always seem to recommend to short circuit a boolean expression using the logical OR operator. Although, notably, most authors don't actually say it in so many words but just leave you with an example that is just supposed to work. Like the accepted answer in this case which consists of just one line:

select * from employee where Department = @dept AND (@flag != 1 OR Salary > @sal)

Having worked with SQL for quite some time, I was confused when I first saw this. If I'm thinking of a where clause that looks something like the typical example WHERE Country = 'USA' OR Country='Canada', we know that the query won't just stop with the evaluation of the full clause after it finds the first condition to be true, it will always also evaluate the second condition and present the results accordingly.

In the answer mentioned above, however, the second condition is not evaluated if the first is found to be true. While I have found out that this behavior is called short-circuiting, I still find it difficult to understand why it works the way it does. To be clear, I understand how this would work in a procedural language such as C but I don't quite understand why it works in SQL and any more detailed insights to this would be appreciated.

Upon researching the subject further, I came across several articles, such as here and here that further explain that this so called short-circuiting will not always work in SQL. Some answers here on SO seem to go in the same direction. This seems to have to do with the fact that the query optimizer is free to choose whichever path it finds best and this may mean that it will evaluate the conditions in a different order.

Would you consider the accepted answer as the correct answer today? If yes, can you elaborate why you don't see short circuiting as an issue in this specific example? What other, maybe better ways exist to get the same result?

UPDATE: As Gordon noticed, the original procedure had an issue. The second condition read @sal < 10000. So, a variable was compared to a constant, there was no relationship between either with the queried table. I'm sure the original poster meant to compare the constant to a value in the data table, so I rewrote both the procedure as well as the given answer to fit this.

Upvotes: 0

Views: 157

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270443

This addresses the original version of the question.

In this logic:

where Department = @dept AND (@flag <> 1 OR @sal < 10000)

Both conditions are (conceptually) evaluated for all rows in the table. I say conceptually, because a SQL query describes the result set; it does not specify the individual steps. So, rows might be skipped using indexes or partitions, for instance.

That said, the second condition consists only of constant values. A smart (and even not so smart) optimizer will recognize this and evaluate it during the compilation phase. So the query that actually gets executed would be one of:

where Department = @dept and <true>
where Department = @dept and <false>

If the latter, a further step in the optimization phase can say: "Hey, this is always false, so I don't have to do anything."

In all cases, though, the result set is based on what the query describes -- just constant portions may be evaluated up-front rather than row-per-row.

Upvotes: 1

Related Questions