anoop john
anoop john

Reputation: 231

join index or collect stats which is better in Teradata

Am facing an issue with one of my FACT tables. Through same job, I call a procedure to load this FACT table and then second procedure to collect stats on this fact table.

As part of a new requirement I need to create a join index which will also include the above mentioned fact tables.

I believe that join index will be executed whenever there is a change in any of involved tables.So what will happen in above scenario?.will my collect stats procedure wait for join index execution to complete.or Will there be any contention because of the simulataneous occurance of collect stats and joinindex

Regards, Anoop

Upvotes: 1

Views: 955

Answers (1)

Rob Paller
Rob Paller

Reputation: 7786

The Join Index will automatically be maintained by Teradata when ETL processes add, change, or delete data in the table(s) referenced by the Join Index. The Join Index will have to be removed if you apply DDL changes to table(s) referenced in the Join Index that affect the columns participating in the Join Index or before you can DROP the table(s) referenced in the Join Index.

Statistics collection on either the Join Index or Fact table should be reserved until after the ETL for the Fact table has been completed or during a regular stats maintenance period. Whether you collect stats after each ETL process or only during a regular stats maintenance period is dependent on how much of the data in your Fact table is changing during each ETL cycle. I would hazard a guess that if you are creating a join index to improve performance of querying the fact table you likely do not need to collect stats on the same fact table after each ETL cycle unless this ETL cycle is a monthly or quarterly ETL process. Stats collection on the JI and fact table can be run in parallel. The lock required for COLLECT STATS is no higher than a READ. (It may in fact be an ACCESS lock.)

Depending on you release of Teradata you may be able to take advantage of using the THRESHOLD options to allow the optimizer to determine whether or not statistics do in fact need to be collected. I believe this was included in Teradata 14 as a stepping stone toward the automated statistics maintenance that has been introduced in Teradata 14.10.

Upvotes: 0

Related Questions