Ram Das
Ram Das

Reputation: 358

Indexes on table which has duplicate values in sql

I have two tables

  1. Main table PostInfo columns like( PostID,PostSubject,PostedDate,IsProcessed).One entry for each Post ID

  2. Transaction Table Post_Transaction

     PostID   PostComments     Post_InsertedOn               UserID
     --------------------------------------------------------------------
     10000    VRDFHFGFTR     2013-10-26 21:08:19.817          43434
     10000    GFDGDFSDFF     2013-10-26 21:12:32.323          67576
     10000    HGFHGFBNBF     2013-10-26 21:43:43.545          3232
     10000    JNFNGHFGHG     2013-10-26 21:45:46.656          768
     10000    MJHJNGJHGH     2013-10-26 21:56:32.767          9897
     10001    XCVGFDGDFG     2013-10-26 22:54:54.868          3424
     10001    YTUGFGHHGF     2013-10-26 13:32:54.132          12313
     10001    HGFHFGHGHF     2013-10-26 18:08:32.878          6565
    

We have clustered unique index on PostID column in PostInfo table.

Clustered non unique index on PostID column in Post_Transaction table. How far this non unique clustered index(Post_Transaction) would work efficiently as there will be 50000+ entries for each PostID in Post_Transaction table.

Or

do i need to create an identity column(IdentID) in Post_Transaction table.so that i can create Clustered unique index on that IdentID column,Non clustered non unique index on PostID in Post_Transaction table.

Which way i need to proceed,Please suggest. :-)

Upvotes: 1

Views: 1032

Answers (1)

ARA
ARA

Reputation: 1316

Adding an identity only to cluster on it could be a bad choice.

With a non unique clustered index, Sql will add an hidden uniquifier to each row (4 bytes), that's not a big problem. The choice of the cluster key is done according to queries on this table plus the fact that is should be ordered in order not to split too much and to be retrieved in ranges (identities or dates are often choosen) according to the application. If you search frequently on PostInsertedOn it could be a good choice. If queries are more ofen done on PostId it is better for it to be first. I guess PostTransactions are often search by PostId+PostInsertedOn, that could be a good choice (non unique if PostInsertedOn can duplicate).

But the best way is to check that with the queries of your application and the query plans.

Also keep in mind that the cluster key will be an hidden part of all other indexes added as the lookup key of your table.

Upvotes: 1

Related Questions