DM_Morpheus
DM_Morpheus

Reputation: 730

SQL Query Effeciency - Using Left Outer Joins VS Nested Select statement

Say I have two tables, defined as (Assume ID fields are Primary Keys / Unique):

create table TestTable (ID int, Z1 int, Z2 int, Z3 int)

create table ZDetails(zID int, zName varchar(255))

With the following data:

enter image description here

enter image description here

Now, in the TestTable, the columns Z1, Z2, Z3 correspond to zID in ZDetails.

What I want to do is to display data in TestTable along with the zNames fetched from ZDetails.

I see two ways of doing this:

  1. With a Nested Select:
SELECT ID,
        Z1,
        ZN1 = (SELECT zName FROM ZDetails WHERE zID = Z1),
        Z2,
        ZN2 = (SELECT zName FROM ZDetails WHERE zID = Z2)
FROM TestTable
  1. With a LEFT OUTER JOIN
SELECT ID,
    Z1,
    ZD1.zName,
    Z2,
    ZD2.zName
FROM TestTable
    LEFT OUTER JOIN ZDetails ZD1 on Z1 = ZD1.zID
    LEFT OUTER JOIN ZDetails ZD2 on Z2 = ZD2.zID

I find the first one simpler to read but the Estimated Execution Plan looks complex compared to the second one.

Which one do you think would be more efficent? Also, is there a better way to do this? (Without modifying the tables schema)

Upvotes: 0

Views: 69

Answers (2)

gaby awad
gaby awad

Reputation: 1128

The more efficient way is to use the join method instead of the nested select method. When you use the nested query, you are executing the nested query one by one than executing the parent query while in the query using the join method you will be executing a query once.

I attached a image of the difference of join methods that can be used and that are very useful to know.enter image description here

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269483

First, when you have multiple tables in a query, you should always use qualified column names. For the first query, that would be:

SELECT t.ID, t.Z1,
        ZN1 = (SELECT z.zName FROM ZDetails z WHERE z.zID = t.Z1),
        t.Z2,
        ZN2 = (SELECT z.zName FROM ZDetails z WHERE z.zID = t.Z2)
FROM TestTable t;

And similarly for the second query.

Which is better? They do different things. The second can return multiple rows if there are multiple matches in ZDetails. The correlated subqueries would return an error in this case.

In general, you seem to be doing lookups in a reference table. That is more commonly done using a JOIN. But with the right indexes on the tables (which primary keys provide) there is probably little performance difference between the two.

However, the execution plans are different because the code is doing different things (with respect to duplicates).

Upvotes: 1

Related Questions