Reputation: 95
I am designing a database with the following requirements:
A structure for this might be:
Organization
- OrganizationId INT IDENTITY(1,1) NOT NULL PRIMARY KEY
Term
- TermId INT IDENTITY(1,1) NOT NULL PRIMARY KEY
- OrganizationId INT NOT NULL REFERENCES Organization(OrganizationId)
SurveyType
- SurveyTypeId IDENTITY(1,1) NOT NULL PRIMARY KEY
- OrganizationId INT NOT NULL REFERENCES Organization(OrganizationId)
SurveyForm
- SurveyFormId INT IDENTITY(1,1) NOT NULL PRIMARY KEY
- SurveyTypeId INT NOT NULL REFERENCES SurveyType(SurveyTypeId)
- TermId INT NOT NULL REFERENCES Term(TermId)
That structure keeps with what seems to be a popular emphasis on a single surrogate primary key. However that structure sacrifices data integrity because it is very easy for a SurveyForm
record to have a TermId
or SurveyTypeId
from different Organization
s.
To address data integrity, it would seem you would have to add OrganizationId
and use it in the composite keys (OrganizationId, SurveyTypeId)
and (OrganizationId, TermId)
. That is somewhat tolerable in this example but as the schema becomes more complete, the composite key sizes increase.
So my question is, how do people generally approach this now (most references online are from 2008 when I think its possible there were different database design concerns)? As a corollary, when is it acceptable to add foreign keys to a table to reduce the number of tables joined for common expressions?
Upvotes: 3
Views: 301
Reputation: 7692
Academically speaking, you can migrate the Organization key along both lineages. That's just 4 bytes, after all:
create table dbo.Organization (
OrganizationId INT IDENTITY(1,1) PRIMARY KEY
);
go
create table dbo.Term (
TermId INT IDENTITY(1,1) NOT NULL,
OrganizationId INT NOT NULL REFERENCES dbo.Organization(OrganizationId),
primary key (OrganizationId, TermId)
);
go
create table dbo.SurveyType (
SurveyTypeId int IDENTITY(1,1) NOT NULL,
OrganizationId INT NOT NULL REFERENCES dbo.Organization(OrganizationId),
primary key (OrganizationId, SurveyTypeId)
);
go
create table dbo.SurveyForm (
SurveyFormId INT IDENTITY(1,1) NOT NULL,
OrganizationId int not null,
SurveyTypeId INT NOT NULL,
TermId INT NOT NULL,
primary key (OrganizationId, SurveyTypeId, TermId),
foreign key (OrganizationId, TermId) references dbo.Term (OrganizationId, TermId),
foreign key (OrganizationId, SurveyTypeId) references dbo.SurveyType (OrganizationId, SurveyTypeId)
);
go
These tables definitely violate some NF, I don't remember which one exactly, but I'm sure you can handle it yourself.
While this design approach can almost be considered a must for a warehouse (esp. if you aggregate data from different sources), I would never recommend it for any real-life OLTP. Much simpler solution would be:
Upvotes: 2
Reputation: 14820
I think there could be a way to avoid circular references, firstly by defining who really depends on who and removing redundant dependencies.
The question is... are Organization
s allowed to be randomly associated to Term
s without caring about any Survey
association? I wonder if Organization
s really need to be associated to a Term
directly or indirectly through Survey
s. If, for example, an Organization
CANNOT be associated to a Term
that is not associated to the Organization
's Survey
then the Organization-Term relationship is useless, if it is the other way around, then the Organization-SurveyType is not needed
Upvotes: 0