Toadums
Toadums

Reputation: 2812

INNER JOIN execution/evaluation order

I was wondering how exactly inner joins works in mysql.

If I do

SELECT * FROM A a 
INNER JOIN B b ON a.row = b.row
INNER JOIN C c ON c.row2 = b.row2
WHERE name='Paul';

Does it do the joins first, then pick the ones where name = paul? Because when I do it this way, it is SUPER DUPER slow.

is there a way to do something along the lines:

SELECT * FROM (A a WHERE name='paul')
INNER JOIN B b ON a.row = b.row
INNER JOIN C c ON c.row2 = b.row2]

When I try it that way, I just get an error.

or alternately, is it better to just have 3 separate queries, one for A, B and C? example:

string query1 = "SELECT * FROM A WHERE name = 'paul'";
//send query, get data reader
string query2 = "SELECT * FROM b WHERE b = " + query1.b;
//send query, get data reader
string query3 = "SELECT * FROM C WHERE c = " + query1.c;
//send query, get data reader

Obviously this is just pseudo code, but I think it illustrates the point.

Which way is faster/recommended?

Edit Table structure:

**tblTimesheet**
int timesheetID (primary key)
datetime date
varchar username
int projectID
string description
float hours

**tblProjects**
int projectID (primary key)
string project name
int clientID

**tblClients**
int clientID
string clientName

The join that I want is:

select * from tblTimesheet time
INNER JOIN tblProject proj on time.projectID = proj.projectID
INNER JOIN tblClient client on proj.clientID = client.clientID
WHERE username = 'paul';

something like that

Upvotes: 1

Views: 2594

Answers (4)

dash
dash

Reputation: 91490

You are probably missing an index on a key table; you can use the MySql EXPLAIN keyword to help in finding out where your query is slow.

To answer another section of your question;

is there a way to do something along the lines:

SELECT * FROM (A a WHERE name='paul')
INNER JOIN B b ON a.row = b.row
INNER JOIN C c ON c.row2 = b.row2]

You can use a SubQuery;

SELECT * 
FROM (SELECT * FROM  tblTimesheet WHERE username = 'Paul') AS time
INNER JOIN tblProject proj on time.projectID = proj.projectID
INNER JOIN tblClient client on proj.clientID = client.clientID

What this query is effectively doing is attempting to prefilter the fields the JOIN will operate on. Rather than join all the fields to together, and then filter those down, it only attempts to JOIN fields from tblTimesheet where the name is 'Paul' first.

However, the query optimizer should already be doing this so this query should perform similarly to your original query.

For more help with indexes, the understanding of which will aid you greatly in database development, start by looking at a tutorial like this one.

Upvotes: 2

paul
paul

Reputation: 22001

Use your first query, the mySql query optimizer should pick the fastest strategy

if you want it to be faster, make sure that there is an index on the name column

Upvotes: 1

Charleh
Charleh

Reputation: 14002

I think you'll find the query optimiser will give you the best possible query most of the time. You need to look at the execution plan to find out why the query is slow - my guess is lack of indexes.

When MySql looks in these tables, it will usually do it in the best way to get the best speed - a simple join as you've illustrated won't confuse the query optimiser, but missing indexes can cause the database engine to scan tables instead of looking up values (i.e. it needs to walk through the table row by row to match the criteria you specified)

An index ensures that the engine doesn't need to go searching down to the leaf page level and will usually speed up queries

What's the table structure here or is this all hypothetical?

The general rule of thumb with SQL is - try it and see!

Upvotes: 2

millimoose
millimoose

Reputation: 39950

It's fantastically unlikely a join will be slower than three database hits. Reordering the clauses shouldn't have an impact either if MySQL's query optimizer is at all competent. Are the columns in the WHERE / ON clauses indexed?

Upvotes: 2

Related Questions