Naveed Butt
Naveed Butt

Reputation: 2901

Difference between IN and JOIN

I was wondering what was the difference between the two queries mentioned below, because the first one takes more than 10 seconds to execute on the server and the second one executes in less than one second...

UPDATE - I Here are the actual queries and their Execution Plans as copied and pasted from SQL Server (as is), sorry for any inconvenience caused by my previous queries... :(

SELECT  REPLACE(CONVERT(VARCHAR(11), m.PlanDate, 106), ' ', '-') AS ManagmentPlanDate
FROM    ManagmentPlan m
        INNER JOIN Product p ON p.Product_ID = m.ProductID
        INNER JOIN Category c ON c.C_ID = p.C_ID
        LEFT OUTER JOIN Employee e ON e.emp_no = m.PrescribedBy
        LEFT OUTER JOIN dbo.Issue_Stock i ON i.serial_no = m.IssueStockID
        INNER JOIN dbo.Units u ON u.U_ID = p.U_ID
WHERE   ( ( @PatientID IS NULL )
          AND ( @VisitID IS NULL )
          AND ( m.WardRegNo = @WardRegNo )
        )
        OR --Get only cuurent admission TP 
        ( ( @PatientID IS NULL )
          AND ( @WardRegNo IS NULL )
          AND ( VisitID = @VisitID
                AND m.WardRegNo IS NULL
              )
        )
        OR -- Get Only Current OPD visit TP 
        ( ( @WardRegNo IS NULL )
          AND ( @VisitID IS NULL )
          AND ( visitid IN ( SELECT id
                             FROM   PatientVisit
                             WHERE  PatientID = @PatientID ) )
        )
        OR --Get All Visits TP
        ( ( @PatientID IS NULL )
          AND ( @VisitID IS NOT NULL )
          AND ( @WardRegNo IS NOT NULL )
          AND ( ( VisitID = @VisitID )
                OR ( m.WardRegNo = @WardRegNo )
              )
        ) -- Get Current OPD visit and cuurent admission TP (Both)
        AND m.Deleted != 1
        AND m.PatientDeptID = @PatientDeptID
GROUP BY REPLACE(CONVERT(VARCHAR(11), m.PlanDate, 106), ' ', '-')
ORDER BY CAST(REPLACE(CONVERT(VARCHAR(11), m.PlanDate, 106), ' ', '-') AS DATETIME) DESC

Execution Plan of the first Query

and

SELECT  REPLACE(CONVERT(VARCHAR(11), m.PlanDate, 106), ' ', '-') AS ManagmentPlanDate
FROM    ManagmentPlan m
WHERE   m.ProductID IN ( SELECT Product_ID
                         FROM   Product
                         WHERE  C_ID IN ( SELECT    C_ID
                                          FROM      Category )
                                AND U_ID IN ( SELECT    U_ID
                                              FROM      Units ) )
        AND m.PrescribedBy IN ( SELECT  Emp_no
                                FROM    Employee )
        AND m.IssueStockID IN ( SELECT  Serial_No
                                FROM    Issue_Stock )
        AND ( ( @PatientID IS NULL )
              AND ( @VisitID IS NULL )
              AND ( m.WardRegNo = @WardRegNo )
            )
        OR --Get only cuurent admission TP 
        ( ( @PatientID IS NULL )
          AND ( @WardRegNo IS NULL )
          AND ( VisitID = @VisitID
                AND m.WardRegNo IS NULL
              )
        )
        OR -- Get Only Current OPD visit TP 
        ( ( @WardRegNo IS NULL )
          AND ( @VisitID IS NULL )
          AND ( visitid IN ( SELECT id
                             FROM   PatientVisit
                             WHERE  PatientID = @PatientID ) )
        )
        OR --Get All Visits TP
        ( ( @PatientID IS NULL )
          AND ( @VisitID IS NOT NULL )
          AND ( @WardRegNo IS NOT NULL )
          AND ( ( VisitID = @VisitID )
                OR ( m.WardRegNo = @WardRegNo )
              )
        ) -- Get Current OPD visit and cuurent admission TP (Both)
        AND m.Deleted != 1
        AND m.PatientDeptID = @PatientDeptID
GROUP BY REPLACE(CONVERT(VARCHAR(11), m.PlanDate, 106), ' ', '-')
ORDER BY CAST(REPLACE(CONVERT(VARCHAR(11), m.PlanDate, 106), ' ', '-') AS DATETIME) DESC

enter image description here

Although, it solved my problem of speed or optimization of the query, but just was curious as to what exactly is the difference between those two queries, as I thought the first one translates to the second one...

UPDATE - I As you can see, both queries differ in only the JOINS converted to IN statements...

Upvotes: 1

Views: 173

Answers (2)

Lieven Keersmaekers
Lieven Keersmaekers

Reputation: 58491

For one, your first statement retrieves all matching records from both the Products and Category tables whereas your second statement only retrieves all matching rows from Products.

What is the performance difference if you change your first statement to

SELECT p.*
FROM   Products p
       INNER JOIN Category c ON p.CatNo = c.CatNo

Edit

(as mentioned by Martin) note that the number of rows is only identical for both statements if CatNo is unique in the Category table. The INNER JOIN will return as many records as there are in the Category table whereas the IN statement will return as many records as there are unique CatNo in the Category table.

Upvotes: 2

Mark Roberts
Mark Roberts

Reputation: 460

An in clause filters the rows that come back from product. An inner join adds columns from category to the select statement output.

Upvotes: 0

Related Questions