Mohit Jain
Mohit Jain

Reputation: 43939

Why single primary key is better than composite keys?

Why is the rejection of composite keys in favor of all tables using a single primary key named id? Cause generally all ORM follow this.

EDIT

I just started learning ruby on rails and in the book of agile development by pragmatic there is a line:--- Rails really don't work too well unless each table has a numeric primary key. It is less fussy about the name of the column. Same kind of line I read when I was learning Doctrine.

EDIT2 Please check this link too. I am getting more and more confused about this thing:--- Composite primary keys versus unique object ID field

From the above link:--

*the primary key should be constant and meaningless; non-surrogate keys usually fail one or both requirements, eventually

If the key is not constant, you have a future update issue that can get quite complicated if the key is not meaningless, then it is more likely to change, i.e. not be constant; see above

Take a simple, common example: a table of Inventory items. It may be tempting to make the item number (SKU number, barcode, part code, or whatever) the primary key, but then a year later all the item numbers change and you're left with a very messy update-the-whole-database problem...

EDIT: there's an additional issue that is more practical than philosophical. In many cases you're going to find a particular row somehow, then later update it or find it again (or both). With composite keys, there is more data to keep track of and more constraints in the WHERE clause for the re-find or update (or delete). It is also possible that one of the key segments may have changed in the meantime!. With a surrogate key, there is always only one value to retain (the surrogate ID) and by definition, it cannot change, which simplifies the situation significantly.*

Upvotes: 31

Views: 30472

Answers (9)

KM.
KM.

Reputation: 103587

I don't think there is a blanket statement that you should only ever use a single primary key named id.

Most people use a surrogate primary key as an auto generate int, because it isolates the primary key from ever needing to be changed, like if you make the PK the user name and they later changed their legal name. You would have to update the PK and all FK columns to reflect the new name. if you had used a surrogate primary key, you just update the user's name in one spot (because the tables join on the int not the name).

The size of a primary key is important because the PK is duplicated into every index you build on the table. If the PK is large (like a string) you have fewer keys per page in the index and the index will take more cache memory to store it. Ints are small.

Having a auto increment int PK lends itself to being a clustered index well, as rows are stored in this order and there is no need to go back and bump rows out of the way to insert a new row, you always add to the table's end.

Upvotes: 40

Larry Lustig
Larry Lustig

Reputation: 50970

Although I agree with most of the reasons given by other respondents, my primary reason for preferring a single-column integer key is that it makes writing a user interface much, much easier.

If you are using some kind of list control to represent your data (a list, list view, combo box, etc) you can uniquely relate each entry back to its database representation through a single integer value stored with the item. Most pre-written components already allow you to attach an integer to each item and for those that don't, it's very easy to extend the component to do so.

If you're passing data between a server application and a web page, it's much easier to store the single identifying value in the the id attribute of the widget that represents the data than to have to compose and parse multi-value ids.

Upvotes: 4

MatthewMartin
MatthewMartin

Reputation: 33143

I worked on an app with a 11 column primary key. It was always great fun retyping the list over and over and over every time I wanted to guarantee I was updating one row. It was a driver of bugs, MS-Access couldn't cope with more than 10 columns in a PK, etc.

Large composite keys are design smells that mean the table holds heterogenous entities or the designer wasn't really sure what it is that is unique about each entity. (Like assuming that hair color, eye color and body weight should be enough to unique identify an employee-- which isn't a good key because you'd need more and more and more columns to make it work and eventually that will include fields that are volatile and change a lot, like weight, or for some people hair color or lack there of.)

Upvotes: 5

Jeffrey L Whitledge
Jeffrey L Whitledge

Reputation: 59463

The only real limitation that I have run into using composite keys regards using an IN expression with a subquery. This is a problem, because a subquery in an IN expression must return a single column (at least in T-SQL).

SELECT
    emp.Name,
    emp.UserDomain,
    emp.UserID
FROM
    employee emp
WHERE
    ???? IN (SELECT e.UserDomain, e.UserID FROM ... /* some complex 
                                                       non-correlated subquery 
                                                       or CTE */
            )

There are always work-arounds, of course, but sometimes it could be an annoyance.

This is hardly a reason to avoid a composite key in places where it makes sense to use one.

Upvotes: 18

leonbloy
leonbloy

Reputation: 75916

Your question is strongly related to the surrogate (or artificial) keys vs natural keys alternative. I think it's not that composite keys are less used, but that natural keys (be them composite or simple) are less favoured than artificial keys.

Traditional relational database theory dealt mostly with "natural" keys, (the ones which have meaning from the business-domain point of view) and in that scenario composite keys are frequently found... naturally.

But in the later years, database design has favoured (though not exclusively) the "artificial" (surrogate) key pattern, typically a sequential number that has no business meaning, only serves to uniquely identifies the record in the table (and perhaps the object in the upper layer).

Upvotes: 0

marc_s
marc_s

Reputation: 754508

Well, it's basically about keeping JOINs simple - which one is simpler to understand:

SELECT
   p.ID, p.Name, p.City,
   c.ID, c.Country, c.ISOCode
FROM
   dbo.Parent p
INNER JOIN
   dbo.Child c on c.ParentID = p.ID

or

SELECT
   p.ID, p.Name, p.City,
   c.ID, c.Country, c.ISOCode
FROM
   dbo.Parent p
INNER JOIN
   dbo.Child c ON c.ParentName = p.Name
     AND c.ParentCity = p.City
     AND c.ParentCountry = p.Country

If you have composite primary keys, anyone joining to your table from a child table must "drag along" all those columns, and all those columns are also going to be present in the child table and the JOIN statements are pretty messy. Much better to have a single (even surrogate) key for the JOIN!

Upvotes: 8

reinierpost
reinierpost

Reputation: 8591

Objects in OO programming have identity regardless of their contents. Rows (tuples) in relational databases are identified by their contents only. So when really doing ORM, i.e. mapping objects from an object-oriented programming language to a relational database, an extra ID must be provided as distinct from the fields and/or properties the object has in the program - unless one or more of those are somehow known to identify objects uniquely.

Upvotes: 0

James Westgate
James Westgate

Reputation: 11444

You can use both. In some cases when making an association between an entity you can use both entity keys as a composite key.

As a rule of thumb I use generated ids for entities and composite keys for relationships.

Upvotes: 10

Exception e
Exception e

Reputation: 1874

  1. For an ORM a single identifying column with a consistent name like table_id is easier than a composite key. But every good ORM support composite keys.

  2. An simple PK can easily be 'autoincremented' by the database. This doesn't hold for a composite key.

  3. A simple PK is also easier to use in queries. When you need to join, you only have to use one column of both relations.

This is not to say that simple PKs are better than composite ones, though.

Upvotes: 1

Related Questions