Reputation: 8404
I'm just getting back into SQL Server and I'm building a few tables, and all the "guidelines" I'm reading indicate that a Clustered Index should be placed on Identity fields, because they're "ever increasing". I'm having a hard time getting my head around this. Let's assume I have the following data:
ID Name Date Promoted
1 John Jones 1/1/2019
2 Bill Franklin 1/15/2018
3 Tom Stewart 5/12/2020
4 Jeff Thomas 7/15/2017
Most of my queries will probably be based on Date Promoted. However, I will have a table with personal identification for each person, so there will be a join to another table based on ID.
In this case, is ID the best field to have a Clustered Index on? If so, can someone explain why in the simplest way possible? Assume this table has 500,000 records (I'm really simplifying my problem with the sample data here), would it make sense to put a Clustered Index on Date Promoted, or would you put the Clustered on ID and a Nonclustered on Date Promoted?
Upvotes: 1
Views: 1064
Reputation: 56
First of all, please note that the Clustered Index defines the order the data is stored on the disk. If you have the Clustered Index on the Date Promoted
column then, with your sample data, inserting the fourth record would require moving all three previous records on disk. (I am simplifying this quite a bit.) As a result, that insert would be slower than an insert with the Clustered Index on the ID
column, which would simply append a new record on disk. That is why it is best to use 'always increasing' columns.
Additionally, all non-clustered indexes contain the value(s) of the Clustered Index. So it is best to set the Clustered Index on the 'smallest' columns, when possible, in terms of consumed space. Otherwise, in the future, as the table grows, it may lead to space issues.
Finally, identity columns are commonly used for joining tables, and thus also benefit from being indexed for read performance. If your Clustered Index is on your identity column, that is already covered. Otherwise, you might need a separate index for the identity column to optimize lookups by ID
.
Upvotes: 2