Teradata performance of joining tables vs joining views

Question

I have a variety of tables that I am joining together. Each table has a primary index and most but not all are partitioned on a date field. Each table has an associated view.

If I write a query in the form

select
*
from view1
join view2
on pi1 = pi2
join view3
on pi1 = pi3
join view4
on pi1 = pi4

...

I run into an out-of-spool-space error. Would it be better to query the tables directly? Or would it be better to create some intermediate tables, do a few joins at a time, and then define new indexes and partitioning on the intermediate tables?

dnoeth · Accepted Answer

Creating intermediate tables should not be necessary.

Without further details I can only guess, but there might be a simple cause:

There are two tables like invoice and invoice_line; the logical PKs are (invoice_number) and (invoice_number, line_number), respectively.
The Primary Index of both tables is (invoice_number), so that all rows for an invoice land on a single AMP for faster processing.
Both tables are partitioned by invoice_date. (Strictly speaking, invoice_line doesn't need invoice_date, because it's the same date for every line of an invoice. It's kept there so that both tables have matching partitioning.)
The join doesn't include invoice_date; it's based on invoice_number only. This is correct in terms of the PK-FK relationship, but it results in a very slow join because the optimizer doesn't know which partition holds a given invoice_number, so all partitions need to be accessed.

In a case like that you must use invoice_date as an additional join condition.
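
As a rough sketch of that scenario (the table and column names come from the example above; the select list is only illustrative), the two variants of the join might look like this:

```sql
-- Join on the Primary Index only: logically correct, but the optimizer
-- cannot tell which partition holds a given invoice_number, so every
-- partition of both tables has to be probed.
SELECT i.invoice_number, l.line_number
FROM invoice AS i
JOIN invoice_line AS l
  ON l.invoice_number = i.invoice_number;

-- Same join with the partitioning column added as a join condition:
-- matching rows sit in matching partitions, so the join can proceed
-- partition by partition.
SELECT i.invoice_number, l.line_number
FROM invoice AS i
JOIN invoice_line AS l
  ON  l.invoice_number = i.invoice_number
  AND l.invoice_date   = i.invoice_date;
```

Both queries return the same rows as long as each line carries the same invoice_date as its invoice; comparing the Explain output of the two should show whether the extra condition changes the plan on your system.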

Otherwise you must supply more info:

As already mentioned: you should post the Explain.

Additionally, it might help to see the PI definitions (plus partitioning) and some statistics information. The easiest way to get the DDL of all objects is to put SHOW in front of the select (unless your DBA has restricted that); stats are returned by HELP STATS tablename;
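
For illustration, gathering that information for the first join of the query in the question could look roughly like this. view1, view2, pi1 and pi2 are the placeholders from the question; table1 is a hypothetical name standing in for the base table behind view1:

```sql
-- Execution plan of the problem query (shortened to the first join here)
EXPLAIN
SELECT *
FROM view1
JOIN view2
  ON pi1 = pi2;

-- DDL of every object the query references: the views and their base
-- tables, including Primary Index and partitioning definitions
SHOW
SELECT *
FROM view1
JOIN view2
  ON pi1 = pi2;

-- Statistics collected on a base table (table1 is a hypothetical name)
HELP STATS table1;
```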
