Reputation: 58291
In a database, I saw many tables in which the Primary-Key(PK)
is AUTO_INCREMENT
type.
Suppose I have a table Children
created as follows:
CREATE TABLE Children(
childNo INTEGER AUTO_INCREMENT NOT NULL PRIMARY KEY,
name VARCHAR(25),
age INTEGER,
address VARCHAR(100)
)
ChildNo
is AUTO_INCREMENT
, but once I have inserted a row, how do I know which value was assigned for a child (for a name)? and bad choice for PK
.
If I search on child's name it would be inefficient (and not guaranteed to be unique). For this reason I think keeping AUTO_INCREMENT
as primary key denotes weak schema design?
Suppose I have another table Parents
and there I need to keep ChidNo
as Foreign Key (FK)
. Then it would be complex.
If there is a recursive association then keeping PK an AUTO_INCREMENT would be much bad.
Keeping auto-increment field in a relation denotes that normalization is not appropriate?
In some table instead to introduce an extra AUTO_INCREMENT
field I would like to keep all column as PK. Am I wrong?
Because my thoughts are against using AUTO_INCREMENT
, please suggest me also usability to keep AUTO_INCREMENT
field?
Upvotes: 3
Views: 212
Reputation: 563021
The auto-increment PK column can be called a surrogate key.
Using surrogate keys can be a helpful optimization in some cases:
So there are legitimate benefits for surrogate keys.
That said, surrogate keys are often over-used. Many application frameworks (e.g. Ruby on Rails) use a default that every table has an integer surrogate key named ID
regardless of whether it's appropriate. You can specify a PK column on a table by table basis, but many programmers take the default as a rule, and this leads them to have some senseless table designs. The worst example I've seen is for every many-to-many table to have a superfluous ID
column.
For what it's worth, using a surrogate key has nothing to do with normalization. That is, rules of normalization neither encourage nor discourage using surrogate keys.
Every database that supports surrogate keys also provides a function that returns the most recently generated id value in the current session. As @JStead mentioned, in SQL Server it's @@IDENTITY
or SCOPE_IDENTITY()
. In MySQL, it's LAST_INSERT_ID()
. And so on.
These functions return only a single value, so you can't get all the generated id values if you insert multiple rows in a single INSERT statement. That's a limitation.
Upvotes: 3
Reputation: 1730
I would agree that is over used but it does have it's place in the RDBMS world. You didn't specify a database and this is more of a philosophical debate then any implementation so I will use sql server in my example below.
The best case for an auto_increment or identity is it much more efficient and easier to use than natural keys in a lot of case where the are wide or multi column natural keys.
Lets take the below table.
CREATE TABLE TABLE_OBJECT (
Table_ID int identity(1,1),
Server_NME varchar(128),
Database_NME varchar(128),
Table_NME varchar(128)
)
CREATE TABLE COLUMN_OBJECT (
Column_ID int identity(1,1),
Table_ID int not null,
Server_NME varchar(128),
Database_NME varchar(128),
Table_NME varchar(128),
Column_NME varchar(128)
)
Now lets say in this scenario we want to join the two table together and we don't have the identities.
select * from TABLE_OBJECT to
inner join COLUMN_OBJECT co on co.Server_NME = to.Server_NME
and co.Database_NME = to.Database_NME
and co.Table_NME = to.Table_NME
On top of that sucking to write it is also very inefficient I had to read 6*(128) bytes to compare one row.
Now compare that to the simplicity of the following.
select * from TABLE_OBJECT to
inner join COLUMN_OBJECT co on co.Table_id = to.Table_ID
In the above example I only had to read 2*(4) bytes to compare one row. That is a huge difference when you have a lot of rows.
Then there is the plain storage side too.
CREATE TABLE COLUMN_OBJECT (
Server_NME varchar(128),
Database_NME varchar(128),
Table_NME varchar(128),
Column_NME varchar(128)
)
versus
CREATE TABLE COLUMN_OBJECT (
Column_ID int identity(1,1),
Table_ID int not null,
Column_NME varchar(128)
)
The storage on this is identity version 2*(4) + 128 bytes versus natural key version 4*128 bytes.
Uniqueness can also still be guaranteed with a unique constraint on the table_id and the column_nme. Then in the parent table a unique constraint on table_nme,database_nme,server_nme. To be honest though I would have created a database table as well likely and table would have just a unique constraint on database_id and table_nme but you get the idea. If the proper columns are selected for unique indexes uniqueness should never be an issue.
Getting the value of a previously inserted identity or auto_incremenet is trivial in most languages as well.
select @@IDENTITY
or
select LAST_INSERT_ID()
Every language has a way to get the last one.
Upvotes: 2