Steven Lemmens
Steven Lemmens

Reputation: 1491

In a nonclustered index, how are the second, third, fourth ... columns sorted?

I have this question about SQL Server indexes that has been bugging me of late.

Imagine a table like this:

CREATE TABLE TelephoneBook (
    FirstName nvarchar(50), 
    LastName nvarchar(50), 
    PhoneNumber nvarchar(50)
)

with an index like this:

CREATE NONCLUSTERED INDEX IX_LastName ON TelephoneBook (
    LastName, 
    FirstName, 
    PhoneNumber
)

and imagine that this table has hundreds of thousands of rows.

Let's say I want to select everyone whose last name starts with a B and the firstname is 'John'. I would write the following query:

SELECT 
    * 
FROM TelephoneBook 
WHERE LastName like 'B%' 
AND FirstName='John'

Since the index can help to reduce the number of rows we need to scan because it groups all of the LastNames that start with a B anyway, does it also do this for the FirstName? Or does the database scan every row that starts with a B to find the ones with the first name 'John'?

In other words, how are the second, third, fourth, ... columns sorted in an index? Are they alphabetical in this case as well, so it's pretty easy to find Johanna? Or are they in some sort of a random or different order?

EDIT: why I ask, is because I have just read that in the above SELECT statement, the index will only be used to narrow down the search to the records where the lastname starts with a B, but that the index will NOT be used to find all of the rows with Johanna in it (and will resort to scanning all of the 'B' rows). And I'm wondering why that is? What am I not getting?

Upvotes: 2

Views: 56

Answers (3)

Jason A. Long
Jason A. Long

Reputation: 4442

Logically speaking, the index is sorted by key values in the order of the key. So in this case, LastName (sorted as text), FirstName (sorded as text) and then PhoneNumber (sorted as text)... Any included columns are not sorted at all.

In your case, we know that trailing wildcards are still SARGable, so we'd expect to see an index seek narrowing the data down to all data w/ LastNames starting w/ "B", from that data pool, it will be further filtered to include only those rows that have FirstName = 'John'. You can think of it as an index seek followed by a range seek.

Upvotes: 0

Lorentz Aberg
Lorentz Aberg

Reputation: 11

Gordon's answer is correct in this instance with the specified query. In general, you should be aware that it's not so much grouping records together in "buckets" based on the values of the columns, but rather ordering them according to the index's key columns. In other words, your records in this index will be ordered according to LastName, and for records that share the same LastName value they will be further ordered by FirstName value, and then by PhoneNumber value. You didn't specify a sort order for your columns on this index, but SQL Server defaults unspecified sort orders to ASC(ending), so those columns are indeed lexically sorted in the index .

In your particular case, the query optimizer has decided to look at the index for the first column to determine which records to grab, as Gordon's answer mentions, but SQL Server will reorder predicates if the optimizer decides that would be better, and may use more columns of the index or none at all, depending on the query itself and statistics on the records you are querying.

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270021

As a convenient shorthand, the keys of an index are used for the where clause up to the first inequality. like with a wildcard is considered an inequality.

So, the index will only be used for looking up the first value. However, the entries will probably be scanned to match on the first name, so you will still get index usage.

Of course, the optimizer may decide not to use the index at all, if it decides that a full-table scan is more appropriate.

Upvotes: 2

Related Questions