Reputation: 40092
Let's say I have the following table:
Table: RelationshipType
============================================================
| ID (PK) | ParentID | ChildID | RelationshipType |
============================================================
There are mostly cases where ParentID
and ChildID
are selected on individually:
... WHERE ParentID = @SomeID
... WHERE ChildID = @SomeID
And sometimes both are selected on:
... WHERE ParentID = @SomeID AND ChildID = @SomeOtherID
I want to increase the performance of these queries but most notably the first two. Should I create a non-clustered index on ParentID
+ ChildID
together or one index on ParentID
and another index on ChildID
?
EDIT: All of these queries are highly selective (1 or 2 records returned).
Upvotes: 2
Views: 3185
Reputation: 52157
Can you get rid of the surrogate key ID
?
If yes, consider creating the following:
{ParentID, ChildID}
.{ChildID, ParentID}
, but include the RelationshipType
in the index as well (use the INCLUDE keyword).This way, you have a covering index in all 3 cases, so you don't have to pay the price of the double-lookup (that is normally required for secondary indexes in clustered tables):
... WHERE ParentID = @SomeID
can be satisfied by a simple seek in the B-Tree of the index: {ParentID, ChildID}
. The value of ChildID
and RelationshipType
1 can be retrieved directly from the found leaf of this B-Tree.... WHERE ChildID = @SomeID
can be satisfied by a simple seek in the B-Tree of the index: {ChildID, ParentID}
. The value of ParentID
and RelationshipType
2 can be retrieved directly from the found leaf of this B-Tree.... WHERE ParentID = @SomeID AND ChildID = @SomeOtherID
can be satisfied by either.1 The clustering key is the "main" B-Tree for the table and includes all columns, not just those that are unique.
2 Thanks to INCLUDE (RelationshipType)
.
Doing something similar with the ID
present is possible, but would require 3 indexes instead of 2 and all of them would be fatter to achieve covering. You'd have to measure to make sure, but my feeling is that this would be more trouble than it's worth.
Otherwise, don't use clustering at all. Just create normal indexes on:
{ID}
- regular, non-clustering primary index (use the NONCLUSTERED keyword).{ParentID}
- regular secondary index.{ChildID}
- regular secondary index.You'll have a normal heap table, so each access will require an index seek + (usually) table heap access, but your indexes will be kept slim, raising the cache effectiveness.
... WHERE ParentID = @SomeID AND ChildID = @SomeOtherID
would require two index seeks (or possibly a seek on either {ParentID}
or {ChildID}
index + table heap access), but this is still pretty fast and is not too frequent (as you stated).
Please do measure on realistic amounts of data before deciding either way.
Upvotes: 4
Reputation: 3400
I would create a single non-clustered index on each column but include the other column and also the RelationshipType column (I assume RelationshipType is the data you are retrieving):
CREATE NONCLUSTERED INDEX IX_RelationshipType_ParentID
ON
RelationshipType(ParentID)
INCLUDE
(ChildID, RelationshipType)
GO
CREATE NONCLUSTERED INDEX IX_RelationshipType_ChildID
ON
RelationshipType(ChildID)
INCLUDE
(ChildID, RelationshipType)
GO
This will result in the engine being able to get the desired data from the index once it has found the entry, without the need to return to the table after locating the item in the index.
Upvotes: 1
Reputation: 601
In a way, it's impossible to say what would give your database the best performance without knowing the frequency of each of these selects and how often inserts / update to the table occur, but here's my best guess:
It sounds like ParentID, ChildID might be you're primary key, which is by definition a clustered index.
The lazy way to do it here is to create two more non-clustered indexes on ParentID and ChildID. But... particularly the ParentID column, or whatever column comes first in your primary key / cluster index - I'm really not sure if you are going to get any select benefit at all if you create another non-clustered ParentID index. A non-clustered index is going to store a copy of your table ordered by that index, but in this case, the primary key dictates the ordering of your table, and it is already ordered, first, by ParentID.
So in conclusion, I would make a primary key of ParentID and ChildID and a non-clustered index on ChildID and I think you're good to go.
Upvotes: 1
Reputation: 453898
As you say looking up by either ParentId
or ChildId
is highly selective I'd just go for the two separate indexes.
SQL Server can then use either index for the WHERE ParentID = @SomeID AND ChildID = @SomeOtherID
and evaluate the residual predicate on the one or two rows that match.
I suppose an exception would be if the table is more or less read only and the entire database fits in memory then there is no down side of having the additional index and it avoids a lookup to retrieve the missing column.
Upvotes: 1