Bob
Bob

Reputation: 121

Teradata - Primary Index Fields and Performance

I have a very large volatile table. I set the primary index to customer_id, which is the most unique column on the table. However, I can make the primary index more unique by combining it with customer_id, effective_dt, and group_id.

I can make the primary index even more unique by adding more fields, but I was wondering if there is a point of diminishing return for adding more columns?

Also, I noticed that as the more columns I add a unique primary index, the longer it takes to create the volatile table. Just my observation, but I am not sure if this is true?

Thanks in advance.

Upvotes: 0

Views: 529

Answers (2)

Joe Laert
Joe Laert

Reputation: 136

Am I wrong in thinking that if your table is unique at customer level, then adding also the date field doesn't make it anymore unique?

Upvotes: 0

dnoeth
dnoeth

Reputation: 60482

Basically the PI of a table should be based on following criteria (in order of importance):

  1. Access: Pick a column (or column combination) which is accessed a lot. In best case used for joins (i.e. a FK) plus possibly for WHERE-conditions based on equality

  2. Distribution: The table should not be skewed too much (much might be defined differently for different customers). Usually skew up to 10-30% (AMP with maximum size for this table has 10-30% more data than average) is acceptable. Query dbc.TableSizeV to get the actual Permspace used on each AMP for a table.

  3. Volatility: PI should be stable, not lots of updates.

The Primary Key of a table perfectly fits for two out of three, but might not fit the most important one.

In your case customer_id sounds like a column which will be joined and has a good distribution, so it's probably better than adding more columns to the PI.

The time for creating a Volatile Table should hardly differ with different PIs (unless one is skewed) and I would expect more unique PIs to be a bit more efficient.

Upvotes: 1

Related Questions