Is it more efficient to use the CASE in the original query or in a separate query?

Question

I can't show the real query, but here is an example of the type of thing I'm doing:

select
  t1.contract,
  t1.state,
  t1.status,
  t2.product,
  case
    when t2.product_cost > 100 and t3.status = 'Closed' then 'Success'
    when t2.product_cost <= 100 and t3.status = 'Closed' then 'Semi-Success'
    else 'Failure'
  end as performance,
  t3.sales_date
from contract_table as t1 
left join product_table as t2 on t1.prodkey = t2.prodkey
left join sales_table as t3 on (t1.client_number = t3.client_number and t1.contract=t3.contract)
where t1.client_number in (1, 2, 5, 8, 10)

The tables involved have currently have 27 million records in them and are growing fast.

This query will be put in a view and then used to generate various summaries.

My question is this. Would it be more efficient to join the 3 tables into 1 view that has the detail needed to do the case statements and then run a second query that creates the new variables using the case statements? Or is it more efficient to do what I'm doing here?

Basically, I'm ignorant as to how SQL processes the select statement and accounts for the where statement filtering on the clients from the contract table but not the sales table even though the client_number field is in both.

user4843530 · Accepted Answer

All other things being equal, the only thing I can see that would change the efficiency one way or another would be whether you have where clause conditions in your outer query. If that outer query performed on the view is going to have where clauses that limit the number of records returned, then it would be more efficient to put the case statements on it. That way the case operation will only be performed on the records that pass those conditions, rather than getting performed on every record that passes the view's conditions, only to have those values thrown away by the outer query.

With views, I tend to keep to pretty raw data, as much as possible. Partly, for this reason, so that any query operating on the view, after deciding what rows to use, can do the necessary operations only on those rows.

As for how sql accounts for filtering on the clients from the contract table but not the sales table, think through both the where clause and the joins. The where clause says grab only the records where the contract table's client is 1,2,5,8,10. But the join conditions tell it to only grab the records from sales where that table's client number matches the contract table's client number. If the only records it's grabbing from contract are 1,2,5,8,10, then the only records from sale that will match that join will be the ones where the client numbers are also 1,2,5,8,10. Does that make sense?

Is it more efficient to use the CASE in the original query or in a separate query?

Answers (1)

Related Questions