Reputation: 3243
I understand that the one and only one clustered index is on a table defines how the rows are physically ordered, e.g. in the table
==================================
Contacts
==================================
ID (P.K.) | FirstName | LastName
==================================
1 | 'Donald' | 'Trump'
----------------------------------
2 | 'Crooked' | 'Hillary'
----------------------------------
3 | 'Crazy' | 'Bernie'
would mean that the 3 records are stored physically in the order shown above. But I don't understand why this helpful. Maybe in the case of an auto-incremented primary key with no gaps, like the above example, this helps for queries like
SELECT FirstName+LastName FROM Contacts WHERE ID=2
since the physical ordering enables the lookup if ID=2
to happen in O(1)
time (like getting an element of an array by index). But if the table is like
==================================
Contacts
==================================
ID (P.K.) | FirstName | LastName
==================================
1 | 'Donald' | 'Trump'
----------------------------------
89 | 'Crooked' | 'Hillary'
----------------------------------
12309 | 'Crazy' | 'Bernie'
then the physical ordering disallows O(1)
lookup; the best we can do is O(log(n))
.
So why do we want primary keys to define the physical ordering of rows?
Upvotes: 0
Views: 128
Reputation: 25534
The significance of a clustered index in SQL Server is not "physical ordering" but the fact that row data is available in the leaf pages of the B-tree thus avoiding an additional lookup. The subtree cost is the same as for a nonclustered B-tree index: O(log n).
Physical ordering is really an abstraction of what actually happens inside a clustered index. Pages within extents are stored in allocation order, not necessarily ordered by clustered index key. The index key ordering is maintained in the index allocation map and in the pointer chain whereby each page points to the next (not necessarily adjacent) page. Within pages, rows are also written and stored in allocation order, not key order, and the order will not change unless the page splits. Pages themselves get resorted when indexes are rebuilt but the order is not automatically maintained between rebuilds.
Primary keys aren't necessarily the best choice for a clustered index. Those two concepts are orthogonal to each other.
Upvotes: 2