sammarcow

Reputation: 2956

MSSQL Performance - large n on JOIN

SELECT  A.Id, A.FieldA, A.FieldB, A.FieldC, B.FieldD, B.FieldE, B.FieldF
FROM 

    (SELECT Id, FieldA, FieldB, FieldC from A1
    UNION ALL 
    SELECT Id, FieldA, FieldB, FieldC from A2
    ) AS A
    INNER JOIN 
    (
    SELECT Id, FieldD, FieldE, FieldF FROM B1
    UNION ALL 
    SELECT Id, FieldD, FieldE, FieldF FROM B2
    )  AS B

ON A.Id = B.Id

where A (A1 and A2 combined) has n = 8102869 rows and B (B1 and B2 combined) has n = 17935860 rows, resulting in a table of n = 17935860 rows.

How can I refactor this query to be more efficient, or what processes can I perform on the tables or database in order to increase performance for the above query?

Upvotes: 0

Views: 142

Answers (2)

Ian P

Reputation: 1724

First, you need a clustered index on ALL your tables. Without a clustered index a table is a heap, and unless some other index can be used, any query will do a table scan; it's the only way it can check all the rows.

Second, you should have a (composite / multi-column) index at least covering any columns you use in any join, ideally with the most selective column first.
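As a rough sketch of the first point, the statements below create a clustered index on the join column of each table. The index names and the dbo schema are assumptions made for illustration, and the sketch assumes none of the four tables already has a clustered index (for example through a clustered primary key), in which case CREATE CLUSTERED INDEX would fail.

```sql
-- Hypothetical index names; assumes the dbo schema and that each table
-- is currently a heap (no existing clustered index or clustered primary key).
-- The indexes are deliberately non-unique: nothing in the question says Id is unique.
CREATE CLUSTERED INDEX CIX_A1_Id ON dbo.A1 (Id);
CREATE CLUSTERED INDEX CIX_A2_Id ON dbo.A2 (Id);
CREATE CLUSTERED INDEX CIX_B1_Id ON dbo.B1 (Id);
CREATE CLUSTERED INDEX CIX_B2_Id ON dbo.B2 (Id);
```

Building a clustered index rewrites the whole table, so on tables with millions of rows this is best done in a maintenance window.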

Without such indexes, SQL Server may estimate the join by multiplying the number of rows in each table together and try to build a temporary work table for the result.

So if you have 100,000 rows in one table and 10,000 rows in another, the estimated row count without indexes will be 1,000,000,000 rows. Heaven knows what size temp table that will create!

With indexes (and up-to-date statistics), if there are, say, 100 rows in one table and 10 rows in the other that are likely matches, SQL Server will estimate 1,000 rows, which it can happily store in tempdb, not to mention run a lot faster!
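To bring the statistics up to date, something along these lines should work. The dbo schema is an assumption, and FULLSCAN is optional: it reads every row, which gives the most accurate statistics but can take a while on tables this large.

```sql
-- Assumes the dbo schema. WITH FULLSCAN reads every row in the table;
-- remove the option to let SQL Server use a sampled scan instead.
UPDATE STATISTICS dbo.A1 WITH FULLSCAN;
UPDATE STATISTICS dbo.A2 WITH FULLSCAN;
UPDATE STATISTICS dbo.B1 WITH FULLSCAN;
UPDATE STATISTICS dbo.B2 WITH FULLSCAN;
```

Statistics for a newly created index are built along with the index, so this matters mostly for tables whose data has changed a lot since their indexes were created.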

Upvotes: 0

Laurence

Reputation: 10976

Can you post the query plan?
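One way to capture it, assuming you are running from SSMS or sqlcmd (where GO separates batches), is to ask for the estimated plan as XML. With SHOWPLAN_XML on, the query is compiled but not executed, so it returns quickly even on large tables. The query below is the one from the question, using the A and B aliases in the select list.

```sql
-- Assumption: run from SSMS or sqlcmd; GO is a client-side batch separator.
-- SET SHOWPLAN_XML must be the only statement in its batch.
SET SHOWPLAN_XML ON;
GO

-- Not executed while SHOWPLAN_XML is ON; only the estimated plan is returned.
SELECT A.Id, A.FieldA, A.FieldB, A.FieldC, B.FieldD, B.FieldE, B.FieldF
FROM
    (SELECT Id, FieldA, FieldB, FieldC FROM A1
     UNION ALL
     SELECT Id, FieldA, FieldB, FieldC FROM A2) AS A
    INNER JOIN
    (SELECT Id, FieldD, FieldE, FieldF FROM B1
     UNION ALL
     SELECT Id, FieldD, FieldE, FieldF FROM B2) AS B
    ON A.Id = B.Id;
GO

SET SHOWPLAN_XML OFF;
GO
```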

It's possible that making sure there is a clustered index on id on all the tables and refactoring to the following will speed things up. Lots of merge joins in the query and no sorts is probably the best plan you can get out of this.

Select
  a1.Id, a1.FieldA, a1.FieldB, a1.FieldC, b1.FieldD, b1.FieldE, b1.FieldF
From 
  A1 Inner Join B1 On A1.ID = B1.ID
Union All
Select
  ...
From
  A2 Inner Join B1 On A2.Id = B1.ID 
Union All 
Select
  ...
From
  A1 Inner Join B2 On A1.Id = B2.ID
Union All
Select
  ...
From
  A2 Inner Join B2 On A2.ID = B2.ID
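
An inner join distributes over UNION ALL, so the four pairwise joins together should return exactly the rows of the original query. A quick sanity check, sketched here with only row counts and an assumed dbo schema, is to compare the two totals:

```sql
-- Row count of the original form: a join of the two unions.
SELECT COUNT_BIG(*) AS OriginalRows
FROM (SELECT Id FROM dbo.A1 UNION ALL SELECT Id FROM dbo.A2) AS A
INNER JOIN (SELECT Id FROM dbo.B1 UNION ALL SELECT Id FROM dbo.B2) AS B
    ON A.Id = B.Id;

-- Row count of the rewritten form: the sum of the four pairwise joins.
SELECT
    (SELECT COUNT_BIG(*) FROM dbo.A1 AS a1 INNER JOIN dbo.B1 AS b1 ON a1.Id = b1.Id)
  + (SELECT COUNT_BIG(*) FROM dbo.A2 AS a2 INNER JOIN dbo.B1 AS b1 ON a2.Id = b1.Id)
  + (SELECT COUNT_BIG(*) FROM dbo.A1 AS a1 INNER JOIN dbo.B2 AS b2 ON a1.Id = b2.Id)
  + (SELECT COUNT_BIG(*) FROM dbo.A2 AS a2 INNER JOIN dbo.B2 AS b2 ON a2.Id = b2.Id)
    AS RewrittenRows;
```

If the two numbers differ, the rewrite was not applied consistently (for example, one of the four table pairs is missing).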

Also, you've tagged this mysql and sql-server. I'm speaking about SQL Server here; I don't know enough about the ins and outs of MySQL.

Upvotes: 1
