Micah
Micah

Reputation: 10395

Will this composite index help all of these queries?

Let's say I have the following table:

Orders
======
OrderID
CustomerID
StatusID
DateCreated

And I have the following queries:

select CustomerID from Orders where OrderID = 100

select OrderID from Orders where CustomerID = 20

select OrderID, StatusID from Orders where CustomerID = 100 and OrderID = 1000

If I make the following index:

create nonclustered index Example
  On dbo.Orders(OrderID,CustomerID)
  Include(StatusID)

Does that take care of optimization of all 3 queries with one index? In other words, do composite indexes improve queries that use one of the items within the composite? Or should individual indexes be created just on those columns as well (ie OrderID, CustomerID) in order to satisfy queries 1 and 2?

Upvotes: 2

Views: 1633

Answers (3)

Amy B
Amy B

Reputation: 110171

create nonclustered index Example
  On dbo.Orders(OrderID,CustomerID)
  Include(StatusID)

Read this index as: Create a system maintained copy of OrderID, CustomerID, StatusID from the Orders table. Order this copy by OrderID and break ties with CustomerID.

select CustomerID from Orders where OrderID = 100

Since the index is ordered by OrderID, finding the first qualifying record is fast. Once we find the first record, we can continue reading in the index until we find one where OrderID isn't 100. Then we can stop. Since all the columns we want are in the index, we don't have to lookup into the actual table. Great!

select OrderID from Orders where CustomerID = 20

Since the index is ordered by OrderID and then by CustomerID, qualifying records could appear anywhere in the index. The first record might qualify (OrderID = 1, CustomerID = 20). The last record might qualify (OrderID = 1000000000, CustomerID = 20). We must read the whole index to find qualifying records. This is bad. A minor help: since all the columns we want are in the index, we don't have to lookup into the actual table. So, technically the second query is helped by the index - just not to the degree the other queries are helped.

select OrderID, StatusID from Orders where CustomerID = 100 and OrderID = 1000 

Since the index is ordered by OrderID then by CustomerID, finding the first qualifying record is fast. Once we find the first record, we can continue reading in the index until we find a non-qualifying record. Then we can stop. Since all the columns we want are in the index, we don't have to lookup into the actual table. Great!


do composite indexes improve queries that use one of the items within the composite?

Sometimes!

Or should individual indexes be created just on those columns as well (ie OrderID, CustomerID) in order to satisfy queries 1 and 2?

Sometimes not!

The real answer is nuanced by the fact that the order of the columns in the index declaration determines the order of records in the index. Some queries are helped by some orderings, while others aren't. You may need to complement your current index with one more to cover the CustomerID, OrderID case.


"since all the columns we want are in the index, we don't have to lookup into the actual table"-- so the index can be used for reading purposes although not used for seeking/finding purposes?

When an index (which is a copy of a portion of a table) includes all the information needed to resolve the query, the actual table does not need to be read. The index "covers" the query.

Upvotes: 1

Brian White
Brian White

Reputation: 1314

Above answers are correct. But they leave one thing out. You probably want a clustered index on ORDER_ID. And if you create a clustered index on ORDER_ID, then any non-clustered index on the table will automatically include that value, since non-clustered index entries point to the clustered index on tables where there are clustered indexes. So you would want this:

create clustered index IX_ORDERS_ORDERID on ORDERS (OrderID)
go
create nonclustered index IX_ORDERS_CustomerID
On dbo.Orders(CustomerID)
Include(StatusID)
go

Now you have fast search on order id or customer id and all your queries will run well. Do you know how to see the execution plan? In sql studio go to query - include actual execution plan, and then run your queries. You'll see a graphical representation of which indexes are used, whether a seek or scan is executed, etc.

Upvotes: 3

Aaron Bertrand
Aaron Bertrand

Reputation: 280439

This index will not help the second query, since it is not possible to first seek on the left-most column in the index.

Think about a phone book. First, try to find all the people with the last name Smith. Easy, right? Now, try to find all the people with the first name John. The way the "index" in a phone book works is LastName, FirstName. When you're trying to find all the Johns, having them first sorted by LastName is not helpful at all - you still have to look through the whole phone book, because there will be a John Anderson and a John Zolti and everything in between.

To get the most out of all three queries, and assuming these are the only three query forms that are in use, I would potentially suggest an additional index:

create nonclustered index Example2
  On dbo.Orders(CustomerID) INCLUDE(OrderID)

(If OrderID is the primary key, you shouldn't need to INCLUDE it.)

However, this should be tested against your workload, since indexes aren't free - they do require additional disk space, and extra work to be maintained when you are running DML queries. And again, this assumes that the three queries listed in your question are the only shapes you use. If you have other queries with different output columns or different where clauses, all bets are off.

Upvotes: 6

Related Questions