Reputation: 5200

Should identity primary keys be always non-clustered?

Assuming a table with an auto-increment identity such as:

CREATE TABLE [dbo].[Category]
(
    [CategoryId] INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [Title] NVARCHAR(50) NOT NULL,
    [Description] NVARCHAR(128) NULL
)

Since the auto-increment PK makes the table inherently sorted, does it mean that I should make the CategoryId Non Clustered and then create a clustered index on the other two columns CREATE CLUSTERED INDEX [IX_Category_Column] ON [dbo].[JobCategory] ([Title], [Description])?

Upvotes: 3

Answers (3)

Marek Vitek

Reputation: 1633

Clustered Index means that all data in the table are sorted according such index. When you created PrimaryKey, you created that Clustered Index. In each table can be only one Clustered Index.
So you create nonclustered index on other two columns as you need based on queries you will run against the table.
Also note that Clustered Index should be as narrow as possilbe. Reason for it is that it is included in all other indexes. So when you create index on Title column, it will also contain CategoryId column even if you don't specify it.

When creating index you should also consider another aspect. Columns can be part of the index or just "included". It means that it is included in the index but data are not sorted using this column. It can come handy, when you want column in your index, that you will not use in where clause or in join but will be output of your query. Especially when data in this column change frequently.

Let's take your table and add some data to it

insert into Category (Title, Description) values ('Title2', 'Description2_B')
insert into Category (Title, Description) values ('Title2', 'Description2_A')
insert into Category (Title, Description) values ('Title1', 'Description1_B')
insert into Category (Title, Description) values ('Title1', 'Description1_A')

Now create index on both Title and Description columns

create nonclustered index idx_category_title on Category (title, Description)

Running select on this table will give you

select Title, Description from category where title Like 'Title%'

Results:
    |  Title |    Description |
    |--------|----------------|
    | Title1 | Description1_A |
    | Title1 | Description1_B |
    | Title2 | Description2_A |
    | Title2 | Description2_B |

As you can see result is sorted by Title first, then by Description. With this index every time you modify Description your index will have to be updated to have data sorted.

Now Let's try same table and same data but with index, where column Description is "included"

create nonclustered index idx_category_title on Category (title) include (Description)

Running same select on this setup will give you

select Title, Description from category where title Like 'Title%'

Results:
    |  Title |    Description |
    |--------|----------------|
    | Title1 | Description1_B |
    | Title1 | Description1_A |
    | Title2 | Description2_B |
    | Title2 | Description2_A |

As you can see data are sorted by Title, but not by Description. The real performance gain here is when you modify Description. As the index is not sorted using this column, changing it will not change position of the records in the index.

Upvotes: 2

SqlZim

Reputation: 38063

Since the auto-increment PK makes the table inherently sorted

That is not true. The clustering key is what determines how rows are sorted when stored.

In general, the clustering key should be narrow (e.g. int or bigint) and ever-increasing (like identity()).

You should also consider giving your constraints and indexes more sensible names rather than accepting whatever name would be automatically generated by sql server.

create table dbo.Category (
    CategoryId int identity(1,1) not null
  , Title nvarchar(50) not null
  , Description nvarchar(128) null
  , constraint pk_Category_CategoryId primary key clustered (CategoryId)
);

To support queries like:

select CategoryId, Title, Description 
from dbo.Category
where Title = 'MyTitle';

You would create an additional nonclustered index like the following:

create nonclustered index ix_Category_Title 
  on dbo.Category (Title)
    include (Description);

Reference:

Upvotes: 1

Rahul

Reputation: 77926

PRIMARY KEY by default creates a clustered index on that PK column and it's that index which stores the physical sorting order and not the IDENTITY column by itself I believe.

Thus, you should create a Non-clustered index on the rest column if needed.

Indexes on other column would definitely help increase query performance if you intend to use those cols in filter condition or join condition etc. In which case, it will perform a index scan instead of table scan. To experience the same, run a query which involve the other columns in such condition with and without having index on them. Get the actual query execution plan and see for yourself.

Upvotes: 3

Should identity primary keys be always non-clustered?

Answers (3)

Related Questions