BellevueBob
BellevueBob

Reputation: 9618

How does a WHERE clause work when tables are joined to themselves?

I am re-writing a query that was generated by Business Objects and uses "old-fashioned" implicit join syntax. The code joins a table to itself in one part and also has a "global" where clause. For example:

select   a.col1
       , b.col2
from     MYDB.TABLE a, MYDB.TABLE b
where    a.something=b.something_else
     and MYDB.TABLE.source='A'

Above is a made up illustration of the question. The actual query is very long, joining about ten tables.

My question: as written above, does the "extra" where condition apply to both instances of the same table? I think yes, but I've never seen code like this before. I'm using Teradata but I think this is a general SQL question.

UPDATE: Perhaps my attempt to narrow the question wasn't accurate. Here is the complete query I'm trying to modify:

SELECT
  abs_contrct_prof.contrct_nbr_txt,
  gbs_org_LVL1.bus_pln_sgmnt_cd,
  abs_gnrc_lst_of_val_LVL1.wirls_val_1_txt
FROM
  EDWABSUSERVIEWS.abs_contrct  abs_contrct_prof,
  EDWABSUSERVIEWS.gbs_org  gbs_org_LVL1,
  EDWABSUSERVIEWS.abs_gnrc_lst_of_val  abs_gnrc_lst_of_val_LVL1,
  EDWABSUSERVIEWS.abs_contrct,
  EDWABSUSERVIEWS.gbs_sls_actv_blng_org_rltd,
  EDWABSUSERVIEWS.gbs_sls_actv_org_rltd
WHERE
  ( abs_contrct_prof.type_cd = 'PROFILE'  )
  AND  ( EDWABSUSERVIEWS.abs_contrct.type_cd = 'AGREEMENT'  )
  AND  ( abs_contrct_prof.prnt_contrct_id=EDWABSUSERVIEWS.abs_contrct.contrct_id  )
  AND  ( abs_contrct_prof.org_id=EDWABSUSERVIEWS.gbs_sls_actv_org_rltd.org_id  )
  AND  ( EDWABSUSERVIEWS.gbs_sls_actv_org_rltd.gbs_lvl_3_org_id=EDWABSUSERVIEWS.gbs_sls_actv_blng_org_rltd.gbs_lvl_3_org_id  )
  AND  ( EDWABSUSERVIEWS.gbs_sls_actv_blng_org_rltd.gbs_lvl_1_org_id = gbs_org_LVL1.org_id  )
  AND  ( gbs_org_LVL1.bus_pln_sgmnt_cd=abs_gnrc_lst_of_val_LVL1.nm_txt and abs_gnrc_lst_of_val_LVL1.actv_ind = 'Y' and abs_gnrc_lst_of_val_LVL1.type_cd ='ABS_MOBILITY_SEGMENT'  )
  AND  
  abs_contrct_prof.contrct_nbr_txt  IN  @variable('FAN')
  AND  ( EDWABSUSERVIEWS.abs_contrct.sts_cd = 'Active'  )
  AND  ( EDWABSUSERVIEWS.abs_contrct.type_cd='AGREEMENT'  )

Note the table EDWABSUSERVIEWS.abs_contrct is referenced twice in the FROM clause, one time using an alias and once without. And yes, this query works as written, but I want to re-write it to use explict join syntax (as someone commented).

I ran an EXPLAIN on the query and it appears that the "extra" where conditions (for 'Active' and 'AGREEMENT') are in fact applied to both instances of the table separately.

Upvotes: 2

Views: 3077

Answers (3)

One of the first things a dbms does (acts like it does) in resolving a query is to build a working table from all the table constructors (FROM clauses, JOINs, etc).

Immediately after that, the dbms goes (acts like it goes) to the WHERE clause, and removes from the working table all the rows that don't test as TRUE.

So a valid WHERE clause applies to all the rows in the working table. The irascible Joe Celko, a member of the early SQL standards committee, has written about the order of processing often online. (Search the thread in that link for Effectively materialize.)

Upvotes: 2

Rob Paller
Rob Paller

Reputation: 7786

If your TRANSACTION MODE is TERADATA the optimizer will product join the fully qualified table reference in the WHERE clause against the spool file containing the INNER JOIN results. I have not been able to test this yet with TRANSACTION MODE set to ANSI.

Example

CREATE VOLATILE TABLE Test1
    (
        Col1 SMALLINT NOT NULL,
        col2 VARCHAR(10) NOT NULL
    )
    PRIMARY INDEX (Col1)
    ON COMMIT PRESERVE ROWS;

CREATE VOLATILE TABLE Test2
    (
        Col1 SMALLINT NOT NULL,
        col2 VARCHAR(10) NOT NULL
    )
    PRIMARY INDEX (Col1)
    ON COMMIT PRESERVE ROWS;

SELECT A.Col1
     , B.Col2
     , Test1.Col1
  FROM Test1 A
     , Test2 B
 WHERE A.Col1 = B.Col1
   AND Test1.Col1 = 1;

Explain - Teradata Mode

  1) First, we do an all-AMPs JOIN step from USER.B by way of a RowHash
     match scan with no residual conditions, which is joined to USER.A
     by way of a RowHash match scan with no residual conditions. 
     USER.B and USER.A are joined using a merge join, with a join
     condition of ("USER.A.Col1 = USER.B.Col1").  The result goes into
     Spool 2 (one-amp), which is redistributed by the hash code of (9)
     to all AMPs.  The size of Spool 2 is estimated with low confidence
     to be 1 row (15 bytes).  The estimated time for this step is 0.02
     seconds. 
  2) Next, we do a single-AMP JOIN step from Spool 2 (Last Use) by way
     of an all-rows scan, which is joined to USER.Test1 by way of the
     primary index "USER.Test1.Col1 = 1" with no residual conditions. 
     Spool 2 and USER.Test1 are joined using a product join, with a
     join condition of ("(1=1)").  The result goes into Spool 1
     (all_amps), which is built locally on that AMP.  The size of Spool
     1 is estimated with low confidence to be 1 row (22 bytes).  The
     estimated time for this step is 0.01 seconds. 
  3) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
     statement 1.  The total estimated time is 0.03 seconds. 

UPDATE

INSERT INTO Test1 VALUES (1,'C');
INSERT INTO Test1 VALUES (2,'D');
INSERT INTO Test2 VALUES (1, 'C1');
INSERT INTO Test2 Values (2, 'D2');

Results

Col1  Col2  Test1.Col1
----++----++----------
1     C1    1
2     D2    1

The filter condition was not applied to the tables in the FROM clause.

Upvotes: 3

ypercubeᵀᴹ
ypercubeᵀᴹ

Reputation: 115630

Does the "extra" where condition apply to both instances of the same table?

No, it doesn't. It only applies to one instance of the table.

Plus, the query is not correct written like that, it should (and probably will) give an error in most SQL implementations. You have to name either a or b:

select   a.col1
       , b.col2
from     MYDB.TABLE a, MYDB.TABLE b
where    a.something=b.something_else
     and a.source = 'A'                     --or:--   and b.source = 'A'

Upvotes: 4

Related Questions