mBrice1024

Reputation: 838

SQL Server indexing includes questions

I've been troubleshooting some bad SQL calls in my work's applications. I've been reading up on indexes, tweaking, and benchmarking things. Here are some of the rules I've gathered (let me know if this sounds right):

Now, one thing I'm having trouble finding info on: what happens if a query selects columns that are not part of any index, but its WHERE clause filters on a column that is? Is the index still used, with each leaf entry pointing back to the table to fetch the rest of the row?

ex: table

Id col1 col2 col3

CREATE INDEX my_index
ON my_table (col1)

SELECT Id, col1, col2, col3
FROM my_table
WHERE col1 >= 3 AND col1 <= 6

Is my_index used here? If so, how does it resolve Id, col2, col3? Does it point back to table rows and pick up the values?

Upvotes: 1

Views: 75

Answers (2)

Mathusuthanan

Reputation: 117

From dba.stackexchange.com:

There are a few concepts and terms that are important to understand when dealing with indexes. Seeks, scans, and lookups are some of the ways that indexes will be utilized through select statements. Selectivity of key columns is integral to determining how effective an index can be.

A seek happens when the SQL Server Query Optimizer determines that the best way to find the data you have requested is to navigate directly to a range within an index and read only that range. Seeks typically happen when a query is "covered" by an index, which means the seek predicates are in the index key and the displayed columns are either in the key or included.

A scan happens when the SQL Server Query Optimizer determines that the best way to find the data is to read the entire index and then filter the results.

A lookup typically occurs when an index does not contain all requested columns, either in the index key or in the included columns. The query optimizer will then use either the clustered key (against a clustered index) or the RID (against a heap) to look up the other requested columns.
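As a rough illustration of the lookup case, here is a sketch using the table from the question. It assumes my_table has a clustered primary key on Id (the question does not say so), and the index name my_index_covering is made up for the example. With col2 and col3 listed in INCLUDE, the index covers the query, so no lookup should be needed.

```sql
-- Assumption: my_table has a clustered primary key on Id, so Id is already
-- carried in every nonclustered index as the row locator.
-- my_index_covering is a hypothetical name, not from the question.
CREATE NONCLUSTERED INDEX my_index_covering
ON my_table (col1)          -- key column: supports the range seek on col1
INCLUDE (col2, col3);       -- stored at the leaf level only, not in the key

-- Every selected column is now available in the index, so a seek on col1
-- can answer the query without going back to the table.
SELECT Id, col1, col2, col3
FROM my_table
WHERE col1 >= 3 AND col1 <= 6;
```

The trade-off is a wider index: it takes more space and costs more to maintain on writes, so this is usually worth it only for queries that run often.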

Typically, seek operations are more efficient than scans, due to physically querying a smaller data set. There are situations where this is not the case, such as a very small initial data set, but that goes beyond the scope of your question.

Now, you asked how to determine how effective an index is, and there are a few things to keep in mind. A clustered index's key columns are called the clustering key. This is how records are made unique in the context of a clustered index. All nonclustered indexes will include the clustering key by default, in order to perform lookups when necessary. Every index has to be maintained on each INSERT and DELETE, and on each UPDATE that changes one of its columns. For that reason, it is best to balance performance gains in select statements against performance hits in insert, delete, and update statements.
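One way to weigh read gains against write costs is to compare how often each index is read with how often it is modified. The query below is a sketch against the sys.dm_db_index_usage_stats view; it assumes you have permission to query it, and the counters only reflect activity since the instance last started.

```sql
-- Reads (seeks, scans, lookups) versus writes (updates) per index
-- in the current database.
SELECT
    OBJECT_NAME(i.object_id) AS table_name,
    i.name                   AS index_name,
    s.user_seeks,
    s.user_scans,
    s.user_lookups,
    s.user_updates           -- maintenance caused by INSERT/UPDATE/DELETE
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
    ON  s.object_id   = i.object_id
    AND s.index_id    = i.index_id
    AND s.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
ORDER BY table_name, index_name;
```

An index with many user_updates and few or no seeks, scans, or lookups is a candidate for review, since it is being paid for on writes without helping reads.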

In order to determine how effective an index is, you must determine the selectivity of your index keys. Selectivity can be defined as a percentage of distinct records to total records. If I have a [person] table with 100 total records and the [first_name] column contains 90 distinct values, we can say that the [first_name] column is 90% selective. The higher the selectivity, the more efficient the index key. Keeping selectivity in mind, it is best to put your most selective columns first in your index key. Using my previous [person] example, what if we had a [last_name] column that was 95% selective? We would want to create an index with [last_name], [first_name] as the index key.
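The selectivity figure described above can be measured directly. This is a minimal sketch assuming a person table with first_name and last_name columns as in the example; the index name ix_person_last_first is hypothetical.

```sql
-- Selectivity as defined above: distinct values / total rows, as a percentage.
-- NULLIF avoids a divide-by-zero error on an empty table.
-- Note that COUNT(DISTINCT ...) ignores NULL values.
SELECT
    COUNT(DISTINCT first_name) * 100.0 / NULLIF(COUNT(*), 0) AS first_name_selectivity_pct,
    COUNT(DISTINCT last_name)  * 100.0 / NULLIF(COUNT(*), 0) AS last_name_selectivity_pct
FROM person;

-- If last_name turns out to be the more selective column, it goes first.
CREATE INDEX ix_person_last_first
ON person (last_name, first_name);
```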

I know this was a bit of a long-winded answer, but there really are a lot of things that go into determining how effective an index will be, and a lot of things you must weigh any performance gains against.

Upvotes: 0

Olivier De Meulder

Reputation: 2501

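To answer your question: yes, my_index can be used. When it is, the index entries point back to the table rows, and the Id, col2 and col3 values are picked up there. That is what an index does. Whether the optimizer actually chooses it depends on how many rows fall in the range; if the range matches a large share of the table, a scan may be cheaper than many individual lookups.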
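One way to check what happens on your own data is to compare the query as written with a version that forces the index. This is a sketch using the names from the question; the index hint is only there for the comparison and is not something to leave in production code.

```sql
SET STATISTICS IO ON;   -- report logical reads per table in the Messages tab

-- 1) Let the optimizer choose. With the actual execution plan enabled,
--    look for an Index Seek on my_index plus a Key Lookup (or RID Lookup
--    if the table is a heap), or a scan if it decided against the index.
SELECT Id, col1, col2, col3
FROM my_table
WHERE col1 >= 3 AND col1 <= 6;

-- 2) Force my_index, to compare the logical reads against query 1.
SELECT Id, col1, col2, col3
FROM my_table WITH (INDEX(my_index))
WHERE col1 >= 3 AND col1 <= 6;

SET STATISTICS IO OFF;
```

If the forced version reports noticeably more logical reads, that suggests why the optimizer would skip the index for this range.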

Regarding your 'rules'

  • Rule 1 makes sense. Except for the fact that I usually do not 'include' other columns in my index. As explained above, the index will refer back to the table and quickly retrieve the row(s) that you need.

  • Rule 2, I don't really understand. You create the index and SQL Server will decide which indices to use or not use. You don't really have to worry about it.

  • Rule 3, the order does not really make a difference.

I hope this helps.

Upvotes: 1
