silkfire
silkfire

Reputation: 25965

Peculiar behavior: AND without WHERE still works

I wonder why this query still works perfectly. I thought a WHERE clause always has to start with WHERE?

    SELECT `persons`.*
    FROM `persons`
    LEFT JOIN `team_memberships`
    ON (`team_memberships`.`participant` = `persons`.`id`)
    JOIN `teams`
    ON (`teams`.`id` = `team_memberships`.`team`)
    JOIN `departments`
    ON (`departments`.`id` = `teams`.`department`)
    JOIN `areas`
    ON (`areas`.`id` = `departments`.`area`)
    JOIN `companies`
    ON (`companies`.`id` = `areas`.`company`)

    [NO WHERE HERE]

    AND `persons`.`id` = ?

Upvotes: 3

Views: 95

Answers (6)

Sepster
Sepster

Reputation: 4849

You've just added an AND to your ON clause. The ON clause just evaluates to a boolean true or false, and so AND and eg OR boolean operators are allowed, with all the usual rules about parenthesis etc applying also.

It's often advantageous to move conditions that are specific to a particular joined table, from your WHERE clause and into the appropriate ON clause - if for no other reason than to improve readability.

It can make logical* sense to limit the results of a join before going on to join on other tables, rather than returning a big result set and then filtering this at the end with a WHERE clause. I say *logical because often the DBMS will optimise your query so that this happens anyhow... but not always... sometimes poorly performing queries can be improved by filtering at the join (and sometimes the opposite is true too).

For example, if you know that you're only interested in people over the age of 25, it makes sense to JOIN persons ON persons.id = team_memberships.participant AND persons.age > 25, rather than including all the results for the subsequent tables that only pertain to young punks only to filter these results at the end again with your WHERE persons.age > 25 clause.

My understanding is that join performance is likely to degrade as a result of additional expression(s) in the ON clause, in the instance where the additional expression(s):

  • causes an evaluation to occur on each row in the join (so my age > 25 example above might be in this category), or

  • rely on non-indexed columns from one/both joined tables, when the join would otherwise have relied only on indexed columns.

So eg it might still prove worthwhile stripping the young punks from the result set at the join even if it makes that particular join slower, because the smaller result set at that point might make a big performance increase for subsequent joins. But if their inclusion at the join yields only a slight increase in the number of records, it might actually be quicker to just include them at the join and then filter them out at the WHERE clause.

But I welcome comments clarifying/correcting my understanding.


EDIT: As per @ITroubs comment below, I really should have clarified that if the join is an INNER JOIN then the end resultset will be the same regardless of if the additional filter conditions are in the ON or WHERE clause, but for eg in OP's original example the LEFT JOIN ON team_memberships would yield a completely different resultset if the filter conditions are moved from the WHERE to the LEFT JOIN ... ON ....

Upvotes: 6

user2191247
user2191247

Reputation:

It's valid, see http://dev.mysql.com/doc/refman/5.0/en/join.html, but note how they use brackets.

Upvotes: -1

J0HN
J0HN

Reputation: 26941

It effectively means

join  `companies` on ON (`companies`.`id` = `areas`.`company`) AND persons.id = ?

So, since you are performing inner joins - it's *almost*equvivalent. As soon as you switch to puter joins or complex queries - something will break.

Upvotes: 0

DarkAjax
DarkAjax

Reputation: 16223

It works because you don't actually have a WHERE clause, your condition AND persons.id = ? is part of the last JOIN

Upvotes: 0

ulentini
ulentini

Reputation: 2412

Last AND is not a part of WHERE clause, it's part of JOIN clause

Upvotes: 0

ITroubs
ITroubs

Reputation: 11215

if I am not mistaken then mysql executes your query as follows:

SELECT `persons`.*
FROM `persons`
LEFT JOIN `team_memberships`
ON (`team_memberships`.`participant` = `persons`.`id`)
JOIN `teams`
ON (`teams`.`id` = `team_memberships`.`team`)
JOIN `departments`
ON (`departments`.`id` = `teams`.`department`)
JOIN `areas`
ON (`areas`.`id` = `departments`.`area`)
JOIN `companies`
ON ((`companies`.`id` = `areas`.`company`) AND `persons`.`id` = ?')

Upvotes: 2

Related Questions