Ashish Agarwal
Ashish Agarwal

Reputation: 6283

Why difference in performance between the TWO queries?

I am using SQL Server 2008, and I have two tables

Table1

contains 3.5 million records

+----+-------------+
| pk | dim1        |
+----+-------------+
indexing applied on column **pk**

Table2

contains 15 million records

+----+-------------+
| fk | fact1       |
+----+-------------+
indexing applied on column **fk**

I ran 2 queries on these table to get t2.fact1

Query1

SELECT t2.fact1 
FROM Table1 AS t1, Table2 AS t2 
WHERE t2.fk = t1.pk

Query2

SELECT t2.fact1 
FROM Table1 
WHERE t2.fk IN (SELECT t1.pk FROM Table1 AS t1)

The result that got was Query1 took 7 secs while Query2 took 6 secs

While some where in blogs I read like if I use IN in query like above, will slower the query.

Question#1: can anybody suggest why Query2 was faster?

In an another experiment on same table (When NO INDEXING was applied) I ran above queries and again query2 was faster than Query1, unfortunately don't remember its timing.

Question#2: can anybody suggest why Query2 was faster?

Upvotes: 0

Views: 152

Answers (1)

gbn
gbn

Reputation: 432692

6 vs 7 seconds could easily be measurement errors

  • Run both and see the actual execution plan
  • Run several times and compare timings
  • The 2nd query could run with data in cache, the first one not. Use DBCC DROPCLEANBUFFERS

Observation: You should use this construct

Select t2.fact1 from
    Table1 as t1 JOIN table2 as t2 on t2.fk=t1.pk

Edit:

  • DBCC DROPCLEANBUFFERS will clear the data cache
  • The JOIN syntax is better and clearer

Of course, now I realise the difference will be caused by query 1 giving different results because of multiplying out rows. That is, you have multiple child rows per parent row. The In won't do this.

I normally go on about this but overlooked it earlier.

What I want to know now is how many rows are returned by each query...

Upvotes: 4

Related Questions