Roger

Reputation: 2952

Do I misunderstand joins?

I'm trying to learn the ANSI-92 SQL standard, but I don't seem to understand it completely (I'm new to the ANSI-89 standard as well, and to databases in general).

In my example, I have three tables kingdom -< family -< species (biology classifications).

Why might this happen?

Say a biologist finds a new species but has not yet classified it into a kingdom or family, or creates a new family that has no species and is not sure which kingdom it should belong to, etc.

Here is a fiddle (see the last query): http://sqlfiddle.com/#!4/015d1/3

I want a query that retrieves every kingdom and every species, but not those families that have no species, so I wrote this:

    select *
    from reino r
         left join (
             familia f             
             right join especie e
                 on f.fnombre = e.efamilia
                 and f.freino = e.ereino
         ) on r.rnombre = f.freino 
           and r.rnombre = e.ereino;

What I think this would do is:

  1. join family and species as a right join, so it brings every species, but not those families that have no species. So, if a species has not been classified into a family, it will appear with null on family.

  2. Then, join the kingdom with the result as a left join, so it brings every kingdom, even if there are no families or species classified on that kingdom.

Am I wrong? Shouldn't this show me those species that have not been classified? If I run the inner query on its own, it brings what I want. Is there a problem with how I'm grouping things?

Upvotes: 5

Views: 219

Answers (4)

ypercubeᵀᴹ

Reputation: 115530

If you want the query rewritten with only the slightest change from what you have, you can change the LEFT join to a FULL join. You can further remove the redundant parentheses and the r.rnombre = f.freino condition from the ON clause:

    select *
    from reino r
         full join                      --- instead of LEFT JOIN
             familia f
             right join especie e
                 on f.fnombre = e.efamilia
                 and f.freino = e.ereino
           on r.rnombre = e.ereino;
                                    --- removed the:    r.rnombre = f.freino
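
To see why dropping `r.rnombre = f.freino` can change the result rather than just tidy the query, here is a minimal sketch. The table and column names come from the question, but the column types and the sample rows are assumptions made for illustration, not the contents of the fiddle. It uses a species that has a kingdom but no family:

```sql
-- Assumed schema and data (hypothetical; not taken from the fiddle).
create table reino   (rnombre varchar(20));
create table familia (fnombre varchar(20), freino varchar(20));
create table especie (enombre varchar(20), efamilia varchar(20), ereino varchar(20));

insert into reino   values ('Animalia');
insert into especie values ('Orphan sp.', null, 'Animalia');  -- kingdom known, family unknown

-- With both conditions: f.freino is NULL for this species, so
-- r.rnombre = f.freino is never true and the species cannot pair with its kingdom.
select r.rnombre, f.fnombre, e.enombre
from reino r
     full join (
         familia f
         right join especie e
             on f.fnombre = e.efamilia
            and f.freino = e.ereino
     ) on r.rnombre = f.freino
      and r.rnombre = e.ereino;

-- With only the species condition, the same species pairs with 'Animalia'.
select r.rnombre, f.fnombre, e.enombre
from reino r
     full join (
         familia f
         right join especie e
             on f.fnombre = e.efamilia
            and f.freino = e.ereino
     ) on r.rnombre = e.ereino;
```

On this sample data the first query returns two rows (the kingdom on its own and the species on its own), while the second returns a single row that combines them.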

Upvotes: 1

Michael Fredrickson

Reputation: 37388

You're right on your description of #1... the issue with your query is on step #2.

When you do a left join from kingdom to (family & species), you're requesting every kingdom, even if there's no matching (family & species)... however, this won't return you any (family & species) combination that doesn't have a matching kingdom.
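
A small sketch of that behaviour, under assumptions: the table and column names follow the question, but the column types and the rows below are made up for illustration and are not the fiddle's data.

```sql
-- Assumed schema and data (hypothetical; not taken from the fiddle).
create table reino   (rnombre varchar(20));
create table familia (fnombre varchar(20), freino varchar(20));
create table especie (enombre varchar(20), efamilia varchar(20), ereino varchar(20));

insert into reino   values ('Animalia');
insert into reino   values ('Fungi');                          -- kingdom with nothing in it
insert into familia values ('Felidae', 'Animalia');
insert into familia values ('Canidae', 'Animalia');            -- family with no species
insert into especie values ('Felis catus', 'Felidae', 'Animalia');
insert into especie values ('Unknown sp.', null, null);        -- unclassified species

-- Step 1 on its own: the RIGHT JOIN keeps every species and drops 'Canidae'.
select f.fnombre, e.enombre
from familia f
     right join especie e
         on f.fnombre = e.efamilia
        and f.freino = e.ereino;

-- Step 2: the LEFT JOIN preserves only the rows of reino.
-- 'Unknown sp.' has no matching kingdom, so it is not preserved.
select r.rnombre, f.fnombre, e.enombre
from reino r
     left join (
         familia f
         right join especie e
             on f.fnombre = e.efamilia
            and f.freino = e.ereino
     ) on r.rnombre = f.freino
      and r.rnombre = e.ereino;
```

With these rows, the first query returns both species ('Unknown sp.' with a null family), and the second returns one row for 'Animalia' with 'Felis catus' and one row for 'Fungi' padded with nulls. The unclassified species is missing from the second result.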

A closer query would be:

    select *
    from reino r
         full join (
             familia f
             right join especie e
                 on f.fnombre = e.efamilia
                 and f.freino = e.ereino
         ) on r.rnombre = f.freino
           and r.rnombre = e.ereino;

Notice that the left join was replaced with a full join.

However, this only returns families that are associated with a species. It doesn't return any families that are associated with kingdoms but not species.

After re-reading your question, this is actually what you wanted.


EDIT: On further thought, you could re-write your query like so:

    select *
    from
        especie e
        left join familia f
            on f.fnombre = e.efamilia
            and f.freino = e.ereino
        full join reino r
            on r.rnombre = f.freino
            and r.rnombre = e.ereino;

I think this would be preferable, because it eliminates the RIGHT JOIN, which is usually frowned upon as poor style, and the parentheses, which can be tricky for people to parse correctly when working out what the result will be.

Upvotes: 2

onedaywhen

Reputation: 57023

In case this helps:

Relationally speaking, [OUTER JOIN is] a kind of shotgun marriage: It forces tables into a kind of union—yes, I do mean union, not join—even when the tables in question fail to conform to the usual requirements for union. It does this, in effect, by padding one or both of the tables with nulls before doing the union, thereby making them conform to those usual requirements after all. But there's no reason why that padding shouldn't be done with proper values instead of nulls, as in this example:

    SELECT SNO , PNO
    FROM   SP
    UNION
    SELECT SNO , 'nil' AS PNO
    FROM   S
    WHERE  SNO NOT IN ( SELECT SNO FROM SP )

The above is equivalent to:

    SELECT SNO , COALESCE ( PNO , 'nil' ) AS PNO
    FROM   S NATURAL LEFT OUTER JOIN SP

Source: SQL and Relational Theory: How to Write Accurate SQL Code By C. J. Date
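
Applied to the tables in the question, the same idea might look like the sketch below. This is not from the book: the column types, the sample rows, and the 'nil' placeholder are assumptions for illustration. It uses NOT EXISTS rather than NOT IN because efamilia and ereino can be NULL in this scenario, and an explicit ON clause rather than NATURAL JOIN because the column names differ between the tables.

```sql
-- Assumed schema and data (hypothetical; not taken from the fiddle).
create table familia (fnombre varchar(20), freino varchar(20));
create table especie (enombre varchar(20), efamilia varchar(20), ereino varchar(20));

insert into familia values ('Felidae', 'Animalia');
insert into especie values ('Felis catus', 'Felidae', 'Animalia');
insert into especie values ('Unknown sp.', null, null);   -- unclassified species

-- Species that have a family ...
select e.enombre, f.fnombre
from especie e
     join familia f
         on f.fnombre = e.efamilia
        and f.freino = e.ereino
union
-- ... plus species without one, padded with 'nil' instead of NULL.
select e.enombre, 'nil' as fnombre
from especie e
where not exists (
    select 1
    from familia f
    where f.fnombre = e.efamilia
      and f.freino = e.ereino
);
```

On this data the result is 'Felis catus' with 'Felidae' and 'Unknown sp.' with 'nil', which matches what a LEFT JOIN from especie to familia with COALESCE(f.fnombre, 'nil') would give here. Note that UNION also removes duplicate rows, which the outer join would not.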

Upvotes: 2

Tobi

Reputation: 1438

Try to use this:

    select *
    from reino r
    join especie e on (r.rnombre = e.ereino)
    join familia f on (f.freino = e.ereino and f.fnombre = e.efamilia)

Could it be that you swapped efamilia and enombre in the especie table?

Upvotes: 0
