roryok
roryok

Reputation: 9645

Is a Primary Key necessary in SQL Server?

This may be a pretty naive and stupid question, but I'm going to ask it anyway

I have a table with several fields, none of which are unique, and a primary key, which obviously is.

This table is accessed via the non-unique fields regularly, but no user SP or process access data via the primary key. Is the primary key necessary then? Is it used behind the scenes? Will removing it affect performance Positively or Negatively?

Upvotes: 75

Views: 71751

Answers (10)

onedaywhen
onedaywhen

Reputation: 57023

In the logical model, a table must have at least one key. There is no reason to arbitarily specify that one of the keys is 'primary'; all keys are equal. Although the concept of 'primary key' can be traced back to Ted Codd's early work, the mistake was picked up early on has long been corrected in relational theory.

Sadly, PRIMARY KEY found its way into SQL and we've had to live with it ever since. SQL tables can have duplicate rows and, if you consider the resultset of a SELECT query to also be a table, then SQL tables can have duplicate rows too. Relational theorists dislike SQL a lot. However, just because SQL lets you do all kinds of wacky non-relational things, that doesn't mean that you have to actually do them. It is good practice to ensure that every SQL table has at least one key.

In SQL, using PRIMARY KEY on its own has implications e.g. NOT NULL, UNIQUE, the table's default reference for foreign keys. In SQL Server, using PRIMARY KEY on its own has implications e.g. the table's clustered index. However, in all these cases, the implicit behavior can be made explicit using specific syntax.

You can use UNIQUE (constraint rather than index) and NOT NULL in combination to enforce keys in SQL. Therefore, no, a primary key (or even PRIMARY KEY) is not necessary for SQL Server.

Upvotes: 5

Thomas Weller
Thomas Weller

Reputation: 11717

A PK is not necessary.

But you should consider to place a non-unique index on the columns that you use for querying (i.e. that appear in the WHERE-clause). This will considerably boost lookup performance.

Upvotes: 3

marr75
marr75

Reputation: 5715

As SQLMenace said, the clustered index is an important column for the physical layout of the table. In addition, having a clustered index, especially a well chosen one on a skinny column like an integer pk, actually increases insert performance.

Upvotes: 1

Mitch Wheat
Mitch Wheat

Reputation: 300559

A primary key is really a property of your domain model, and it uniquely identifies an instance of a domain object.

Having a clustered index on a montonically increasing column (such as an identity column) will mean page splits will not occur, BUT insertions will unbalance the index over time and therefore rebuilding indexes needs to be done regulary (or when fragmentation reaches a certain threshold).

I have to have a very good reason to create a table without a primary key.

Upvotes: 1

Remus Rusanu
Remus Rusanu

Reputation: 294287

Do you have any foreign keys, do you ever join on the PK?

If the answer to this is no, and your app never retrieves an item from the table by its PK, and no query ever uses it in a where clause, therefore you just added an IDENTITY column to have a PK, then:

  • the PK in itself adds no value, but does no damage either
  • the fact that the PK is very likely the clustered index too is .. it depends.

If you have NC indexes, then the fact that you have a narrow artificial clustered key (the IDENTITY PK) is helpful in keeping those indexes narrow (the CDX key is reproduced in every NC leaf slots). So a PK, even if never used, is helpful if you have significant NC indexes.

On the other hand, if you have a prevalent access pattern, a certain query that outweighs all the other is frequency and importance, or which is part of a critical time code path (eg. is the query run on every page visit on your site, or every second by and app etc) then that query is a good candidate to dictate the clustered key order.

And finally, if the table is seldom queried but often written to then it may be a good candidate for a HEAP (no clustered key at all) since heaps are so much better at inserts. See Comparing Tables Organized with Clustered Indexes versus Heaps.

Upvotes: 19

SQLMenace
SQLMenace

Reputation: 135011

The primary key is behind the scenes a clustered index (by default unless generated as a non clustered index) and holds all the data for the table. If the PK is an identity column the inserts will happen sequentially and no page splits will occur.

But if you don't access the id column at all then you probably want to add some indexes on the other columns. Also when you have a PK you can setup FK relationships

Upvotes: 5

ProfessionalAmateur
ProfessionalAmateur

Reputation: 4563

If you are accessing them via non-key fields the performance probably will not change. However it might be nice to keep the PK for future enhancements or interfaces to these tables. Does your application only use this one table?

Upvotes: 0

Aaron Digulla
Aaron Digulla

Reputation: 328604

Necessary? No. Used behind the scenes? Well, it's saved to disk and kept in the row cache, etc. Removing will slightly increase your performance (use a watch with millisecond precision to notice).

But ... the next time someone needs to create references to this table, they will curse you. If they are brave, they will add a PK (and wait for a long time for the DB to create the column). If they are not brave or dumb, they will start creating references using the business key (i.e. the data columns) which will cause a maintenance nightmare.

Conclusion: Since the cost of having a PK (even if it's not used ATM) is so small, let it be.

Upvotes: 54

HLGEM
HLGEM

Reputation: 96572

I would never have a table without a primary key. Suppose you ever need to remove a duplicate - how would you identify which one to remove and which to keep?

Upvotes: 3

Tom Gullen
Tom Gullen

Reputation: 61727

The primary key when defined will help improve performance within the database for indexing and relationships.

I always tend to define a primary key as an auto incrementing integer in all my tables, regardless of if I access it or not, this is because when you start to scale up your application, you may find you do actually need it, and it makes life a lot simpler.

Upvotes: 2

Related Questions