Waqar Ahmed
Waqar Ahmed

Reputation: 1444

How Non-Clustered affects Clustered Index (PK)

Columns          | Data Type     | Allow Null
-----------------|---------------|-----------
Id (PK)          | int           | Unchecked
                 |               |  
Type             | tinyint       | Unchecked
                 |               |
Severity         | tinyint       | Unchecked
                 |               |
Message          | varchar(1000) | Unchecked
                 |               |
ReferenceTypeId  | int           | Checked
                 |               |
ReferenceId      | int           | Checked
                 |               |
ParentId         | int           | Checked
                 |               |
ParentTypeId     | int           | Checked
                 |               |
Created          | datetime      | Checked

Okay, in the above table I have 10996674 records. When i run the following Query select count(id) from DTL_Event it takes 1:23 mins.

After applying Indexing

CREATE NONCLUSTERED INDEX DTL_Event_Optimization_Date on DTL_Event([Created])
CREATE NONCLUSTERED INDEX DTL_Event_Optimization_TypeIds_Composite on DTL_Event([ReferenceTypeId],[ParentTypeId])
CREATE NONCLUSTERED INDEX DTL_Event_Optimization_Reference_Parent_Ids_Composite on DTL_Event([ReferenceId],[ParentId])

The same query select count(id) from DTL_Event takes 1 sec. I need to know why, because Id is a primary key which supposed to be indexed already and I am applying indexing on other columns thats why i have a Question:

Why NonClustered Indexing on other Columns optimizes this query select count(id) from DTL_Event. i.e. Why applying Nonclustered indexing on other columns helps me faster retrieval of count of Id(s)

Upvotes: 0

Views: 91

Answers (2)

Neeraj Prasad Sharma
Neeraj Prasad Sharma

Reputation: 1675

After adding a nonclustered index QO is choosing one of non clustered clustered index simply because that is most efficient way to do because of "page count" in the index.

QO might choose (most of the time) NON-clustered index with the lowest page count overall.

As you might know nonclustered indexes has unique identifier for lookup in the other indexes/heap in the leaf level.

Why NonClustered Indexing on other Columns optimizes this query select count(id) from DTL_Event. i.e. Why applying Nonclustered indexing on other columns helps me faster retrieval of count of Id(s)

In your case that is you primary (clustered and unique) key ID, so every nonclustered has ID column in the leaf level for example you create a nonclustered index like

>   CREATE NONCLUSTERED INDEX DTL_Event_Optimization_Date on
> DTL_Event([Created])

Then it will automatically include column ID

>   CREATE NONCLUSTERED INDEX DTL_Event_Optimization_Date on
> DTL_Event([Created]) 
> INCLUDE (ID)

So actually after adding nonclustered index its quite efficient to look on one of the nonclustered index for the rowcount instead of looking on to clustered index you can look into the same in execution plan

TEST RIG :

CREATE Table DTL_Event (ID INT not null , TYPE tinyint not null  , Severity tinyint  not null ,[Message] varchar(1000) not null , ReferenceTypeId int ,ReferenceId int

 ,ParentId int , ParentTypeId int ,Created datetime   , primary key (ID))

Without Non-clustered indeX

  select count(id) from DTL_Event

EXECUTION PLAN enter image description here

WITH NONCLUSTERED INDEX

 CREATE NONCLUSTERED INDEX DTL_Event_Optimization_Date on DTL_Event([Created])



select count(id) from DTL_Event

EXECUTION PLAN enter image description here

BUT if your goal is to count total number of rows present in the table best way is to do that

SELECT SUM (row_count)
FROM sys.dm_db_partition_stats
WHERE object_id=OBJECT_ID('Transactions')   
AND (index_id=0 or index_id=1);

Upvotes: 1

Ben Thul
Ben Thul

Reputation: 32667

Generally speaking, if your pk is also the clustered index, it will also live at the leaf level of any non-clustered indexes. Conversely, all of the table data lives at the leaf level of the clustered index. So, if you're just asking for a count, one of your non-clustered indexes is likely going to be narrower than the clustered and can satisfy the query, so the optimizer will choose that. However, if you add a where clause, all bets are off.

Upvotes: 1

Related Questions