Sam F.

Reputation: 197

Short-circuiting SQL statement

MAIN_TABLE has over 100 million records; SECURITY_TABLE has over 250 million. I'm trying to retrieve objects from MAIN_TABLE that match the filter criteria and that the current user also has access to (access records are stored in SECURITY_TABLE). I'm using something similar to the following to query it:

01 select col1, col2, col3 from main_table
02 where (col4 like '%something%' 
03    or col4 like '%something else%' 
04    or col4 like '%some other thing%')
05 AND
06 col1 in (select st_col1 from security_table 
07    where st_id in (
08        select col1 from main_table
09        where (col4 like '%something%' 
10        or col4 like '%something else%' 
11        or col4 like '%some other thing%'
12        )
13    )
14    AND
15    st_user_id = current_user_id
16)

If I had five matches on the filter criteria in lines 2-4 (criteria A), would the filter criteria in lines 9-11 (criteria B) rescan the entire 100 million records in MAIN_TABLE or only include the five records that were returned from lines 2-4?

Upvotes: 2

Views: 84

Answers (3)

JNK

Reputation: 65187

It Depends™ on a lot of things, including your RDBMS (SQL Server, Oracle, MySQL, etc).

However, the answer for most of these is "maybe".

SQL Server, for instance, may check the second set of criteria first if the query optimizer determines, based on indexes and cardinality, that doing so will be quicker. The two checks may also very likely run in parallel, with the results of both compared in a hash table to find the intersection.

For your specific circumstance, the nature of the query requires a table scan (a LIKE pattern with a leading % wildcard cannot use an index seek), so it's irrelevant.

Upvotes: 2

user359040

Reputation:

It would rescan the entire table. The inner subquery is completely separate from the main query, even though it does exactly the same thing and therefore appears to be completely redundant. If the inner subquery had different criteria, it would not be redundant.
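
If the inner subquery really is redundant, the query could probably be simplified along these lines. This is only a sketch: it assumes st_id and st_col1 both identify the same MAIN_TABLE row (the question does not say so), the aliases m and s are mine, and current_user_id is the placeholder from the question.

```sql
-- Sketch only: assumes the nested filter on st_id adds nothing beyond
-- the outer filter on col4 (i.e. st_id and st_col1 both refer to main_table.col1).
select m.col1, m.col2, m.col3
from main_table m
where (m.col4 like '%something%'
    or m.col4 like '%something else%'
    or m.col4 like '%some other thing%')
  and exists (
      select 1
      from security_table s
      where s.st_col1 = m.col1
        and s.st_user_id = current_user_id  -- placeholder from the question
  );
```

Written this way, MAIN_TABLE is referenced only once, so there is no second filter that could trigger another scan. If st_id and st_col1 are actually different keys, this is not equivalent to the original and the nested subquery has to stay.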

Upvotes: 0

Matthew

Reputation: 10444

Your criterion:

OR LIKE '% ... %' 

is going to require a scan, and an additional scan for each additional similar OR criterion.

When you append the AND clause after line 05, that can be evaluated within the set returned by the previous condition. However, you don't get to control which criterion SQL Server evaluates first; it will try to optimize on its own.

Check your query plan for what it's actually doing.
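
On SQL Server, one way to look at the estimated plan without actually running the query is SET SHOWPLAN_TEXT. A rough sketch, assuming you run it from SSMS or sqlcmd (where GO separates batches) and using a made-up literal 42 in place of current_user_id:

```sql
-- SQL Server (T-SQL) sketch. SET SHOWPLAN_TEXT must be the only statement in its batch.
SET SHOWPLAN_TEXT ON;
GO

-- While SHOWPLAN_TEXT is on, the statement is compiled but not executed;
-- the estimated plan is returned instead of rows.
select col1, col2, col3 from main_table
where (col4 like '%something%'
   or col4 like '%something else%'
   or col4 like '%some other thing%')
AND
col1 in (select st_col1 from security_table
   where st_id in (
       select col1 from main_table
       where (col4 like '%something%'
       or col4 like '%something else%'
       or col4 like '%some other thing%'
       )
   )
   AND
   st_user_id = 42  -- hypothetical user id, stands in for current_user_id
);
GO

SET SHOWPLAN_TEXT OFF;
GO
```

If main_table shows up under two separate scan operators in the output, the inner filter is being evaluated independently of the outer one. What you actually see will depend on your indexes and statistics.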

Upvotes: 0
