kulin
kulin

Reputation: 95

composite vs surrogate primary key

I am designing a database with the following requirements:

A structure for this might be:

Organization

 - OrganizationId INT IDENTITY(1,1) NOT NULL PRIMARY KEY

Term

 - TermId INT IDENTITY(1,1) NOT NULL PRIMARY KEY
 - OrganizationId INT NOT NULL REFERENCES Organization(OrganizationId)

SurveyType

 - SurveyTypeId IDENTITY(1,1) NOT NULL PRIMARY KEY
 - OrganizationId INT NOT NULL REFERENCES Organization(OrganizationId)

SurveyForm

 - SurveyFormId INT IDENTITY(1,1) NOT NULL PRIMARY KEY
 - SurveyTypeId INT NOT NULL REFERENCES SurveyType(SurveyTypeId)
 - TermId INT NOT NULL REFERENCES Term(TermId)

That structure keeps with what seems to be a popular emphasis on a single surrogate primary key. However that structure sacrifices data integrity because it is very easy for a SurveyForm record to have a TermId or SurveyTypeId from different Organizations.

To address data integrity, it would seem you would have to add OrganizationId and use it in the composite keys (OrganizationId, SurveyTypeId) and (OrganizationId, TermId). That is somewhat tolerable in this example but as the schema becomes more complete, the composite key sizes increase.

So my question is, how do people generally approach this now (most references online are from 2008 when I think its possible there were different database design concerns)? As a corollary, when is it acceptable to add foreign keys to a table to reduce the number of tables joined for common expressions?

Upvotes: 3

Views: 301

Answers (2)

Roger Wolf
Roger Wolf

Reputation: 7692

Academically speaking, you can migrate the Organization key along both lineages. That's just 4 bytes, after all:

create table dbo.Organization (
    OrganizationId INT IDENTITY(1,1) PRIMARY KEY
);
go

create table dbo.Term (
    TermId INT IDENTITY(1,1) NOT NULL,
    OrganizationId INT NOT NULL REFERENCES dbo.Organization(OrganizationId),
    primary key (OrganizationId, TermId)
);
go

create table dbo.SurveyType (
    SurveyTypeId int IDENTITY(1,1) NOT NULL,
    OrganizationId INT NOT NULL REFERENCES dbo.Organization(OrganizationId),
    primary key (OrganizationId, SurveyTypeId)
);
go

create table dbo.SurveyForm (
    SurveyFormId INT IDENTITY(1,1) NOT NULL,
    OrganizationId int not null,
    SurveyTypeId INT NOT NULL,
    TermId INT NOT NULL,
    primary key (OrganizationId, SurveyTypeId, TermId),
    foreign key (OrganizationId, TermId) references dbo.Term (OrganizationId, TermId),
    foreign key (OrganizationId, SurveyTypeId) references dbo.SurveyType (OrganizationId, SurveyTypeId)
);
go

These tables definitely violate some NF, I don't remember which one exactly, but I'm sure you can handle it yourself.

While this design approach can almost be considered a must for a warehouse (esp. if you aggregate data from different sources), I would never recommend it for any real-life OLTP. Much simpler solution would be:

  • Perform all modifications via a stored procedure, which will have proper checks against this kind of possible discrepancy.
  • Make sure that no user would have permissions to directly add / modify data in the dbo.SurveyForm, thus circumventing the business rules implemented in the aforementioned SP.

Upvotes: 2

Leo
Leo

Reputation: 14820

I think there could be a way to avoid circular references, firstly by defining who really depends on who and removing redundant dependencies.

The question is... are Organizations allowed to be randomly associated to Terms without caring about any Survey association? I wonder if Organizations really need to be associated to a Term directly or indirectly through Surveys. If, for example, an Organization CANNOT be associated to a Term that is not associated to the Organization's Survey then the Organization-Term relationship is useless, if it is the other way around, then the Organization-SurveyType is not needed

Upvotes: 0

Related Questions