VansFannel
VansFannel

Reputation: 45921

A column as primary key or two foreign keys as primary key

I have the following database design:

enter image description here

An E-Report has one QAP which has some Requirements. A QAP and its Requirements can be used in more than one E-Report.

Every Requirement will have a Yes/No confirmation in each E-Report. I've added EReportReq to store requirements confirmations values (users will set these values).

And also, each Requirement will have more than one Image on each E-Report. EReportReqImg will store Image and Requirement relationship.

If you need more details about this database model, please tell me.

My question is about EReportReq table. I'm not sure if I need a column as primary key (EReportReqId) or I can use eReportId and requirementId as primary key.

If I use these two columns, eReportId and requirementId as primary key, I will need to add these two to EReportReqImg table, so I don't know if this approach is better than mine.

What do you think?

Upvotes: 0

Views: 1005

Answers (2)

Branko Dimitrijevic
Branko Dimitrijevic

Reputation: 52107

My question is about EReportReq table. I'm not sure if I need a column as primary key (EReportReqId) or I can use eReportId and requirementId as primary key.

You can use either of them - none of them is absolutely "better". Just be careful that if you decide to use the first approach, also create a UNIQUE constraint on {eReportId, requirementId}.

The fist approach (with non-identifying relationship and surrogate key) leads to:

  • "leaner" foreign keys in child tables (which is EReportReqImg in this case) - as you already noted,
  • the cascading ON UPDATE doesn't propagate to children (so if you update EReport.eReportId, only EReportReq.eReportId is cascade-updated, but not EReportReqImg.eReportId)
  • and can be more friendly to ORMs.

On the other hand, the second approach (with identifying relationship and natural keys):

  • has potentially less need for JOINs (e.g. you don't need to EReportReqImg JOIN EReportReq just to find-out requirementId - you have it directly in EReportReqImg.requirementId),
  • is better suited for clustered tables (e.g. EReportReq rows with the same eReportId will be stored physically "close", which may significantly benefit some queries)
  • avoids additional index on the surrogate key.

Since you have a small number of child tables, "fat" FKs don't matter much and since we are dealing with IDs, they are unlikely to change and cascading ON UPDATE is unlikely to be a problem. So, my instinct is to go with the second approach, but you might have some other considerations that might tip your decision in a different direction...

Upvotes: 2

Dewfy
Dewfy

Reputation: 23624

Let's start from this state:

I will need to add these two to EReportReqImg

In general usage of 2 FK as PK is normal practice for non-changeable data. So if EReportReq is not supposed to be changed in manner that you will drag it to another requirementId or eReportId then use compound key. Otherwise it is more robust and efficient to use single-value primary key - because it is not changed during the time and as result you don't need write trigger or use tricky cascade to update children tables.

Other option to review is simplicity of result SQL, simple better than complex - write INNER JOIN with 2 fields is complex construction and there is potentially bug-place for miss one of the keys.

Upvotes: 0

Related Questions