Reputation: 914
I'm trying to decide which approach to take for a data model I'm creating. I have two tables that I originally created using composite keys. I am now adding a third table, a join table of the previous two, which would have a composite key of three fields and two foreign keys of two fields each. This will likely be in MySQL and be used with some sort of Java persistence framework. I've always liked composite keys because they seem like a more natural way to represent the data, but I want to make sure I'm not setting myself up for a world of hurt later on. Should I keep going with this approach, or just create auto-incrementing ids on the tables?
Upvotes: 3
Views: 674
Reputation: 39
Surrogate keys for join tables have a HUGE drawback when it comes to constraining deeper relationships with foreign keys. We'll need a 6-table setup to demonstrate it.
The base tables:
CREATE TABLE semester (semester_id INTEGER PRIMARY KEY, semester_name VARCHAR(40));
CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_name VARCHAR(40));
CREATE TABLE subject (subject_id INTEGER PRIMARY KEY, subject_name VARCHAR(40));
Then let's connect them:
CREATE TABLE enrollment (
enrollment_id INTEGER PRIMARY KEY,
semester_id INTEGER NOT NULL,
student_id INTEGER NOT NULL,
room_number INTEGER,
FOREIGN KEY (semester_id) REFERENCES semester (semester_id),
FOREIGN KEY (student_id) REFERENCES student (student_id),
UNIQUE INDEX (semester_id, student_id)
);
-- similarly ...
CREATE TABLE class(class_id ..., semester_id ..., subject_id ..., class_number ...);
So far, everything's awesome. However, we need to connect them some more:
CREATE TABLE grades (
student_in_class_id INTEGER PRIMARY KEY,
enrollment_id INTEGER NOT NULL,
class_id INTEGER NOT NULL,
grade CHAR(1),
FOREIGN KEY (enrollment_id) REFERENCES enrollment (enrollment_id),
FOREIGN KEY (class_id) REFERENCES class (class_id),
UNIQUE INDEX (enrollment_id, class_id)
);
The problem: how on earth are we supposed to enforce that the enrollment and the class reference the same semester? (Short answer: with foreign keys alone, we can't.)
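To make the problem concrete, here is a minimal sketch that loads the surrogate-key schema into an in-memory SQLite database and inserts a grade whose enrollment and class belong to different semesters. SQLite stands in for MySQL purely so the example is runnable. The columns of `class` are elided above, so the ones used here are assumptions, and the sample rows and the function name `count_cross_semester_grades` are made up for illustration.

```python
import sqlite3

# Surrogate-key schema from the answer. The `class` columns are assumed,
# because the original elides them.
SURROGATE_SCHEMA = """
CREATE TABLE semester (semester_id INTEGER PRIMARY KEY, semester_name VARCHAR(40));
CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_name VARCHAR(40));
CREATE TABLE subject (subject_id INTEGER PRIMARY KEY, subject_name VARCHAR(40));
CREATE TABLE enrollment (
    enrollment_id INTEGER PRIMARY KEY,
    semester_id INTEGER NOT NULL REFERENCES semester (semester_id),
    student_id INTEGER NOT NULL REFERENCES student (student_id),
    UNIQUE (semester_id, student_id)
);
CREATE TABLE class (
    class_id INTEGER PRIMARY KEY,
    semester_id INTEGER NOT NULL REFERENCES semester (semester_id),
    subject_id INTEGER NOT NULL REFERENCES subject (subject_id),
    UNIQUE (semester_id, subject_id)
);
CREATE TABLE grades (
    student_in_class_id INTEGER PRIMARY KEY,
    enrollment_id INTEGER NOT NULL REFERENCES enrollment (enrollment_id),
    class_id INTEGER NOT NULL REFERENCES class (class_id),
    grade CHAR(1),
    UNIQUE (enrollment_id, class_id)
);
"""


def count_cross_semester_grades():
    """Insert a grade whose enrollment and class sit in different semesters."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SURROGATE_SCHEMA)
    conn.executescript("""
        INSERT INTO semester VALUES (1, 'Fall'), (2, 'Spring');
        INSERT INTO student VALUES (1, 'Alice');
        INSERT INTO subject VALUES (1, 'Algebra');
        INSERT INTO enrollment VALUES (1, 1, 1);  -- Alice enrolled in Fall
        INSERT INTO class VALUES (1, 2, 1);       -- Algebra taught in Spring
        INSERT INTO grades VALUES (1, 1, 1, 'A'); -- every foreign key is satisfied
    """)
    # Count grades whose enrollment and class disagree about the semester.
    (count,) = conn.execute("""
        SELECT COUNT(*)
        FROM grades g
        JOIN enrollment e ON e.enrollment_id = g.enrollment_id
        JOIN class c ON c.class_id = g.class_id
        WHERE e.semester_id <> c.semester_id
    """).fetchone()
    conn.close()
    return count


if __name__ == "__main__":
    print(count_cross_semester_grades())
```

The insert goes through with foreign key enforcement switched on, and the function returns 1: the database happily stores a Fall enrollment graded against a Spring class.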
The same with composite keys. Here I don't need an extra key to enforce UNIQUE combinations of the foreign key columns; the PRIMARY KEY does that for me by default:
CREATE TABLE enrollment (
semester_id INTEGER NOT NULL,
student_id INTEGER NOT NULL,
room_number INTEGER,
PRIMARY KEY (semester_id, student_id),
FOREIGN KEY (semester_id) REFERENCES semester (semester_id),
FOREIGN KEY (student_id) REFERENCES student (student_id)
);
-- along the same lines...
CREATE TABLE class(semester_id ..., subject_id ..., class_number ...);
Then the ominous grades table:
CREATE TABLE grades (
semester_id INTEGER NOT NULL,
student_id INTEGER NOT NULL,
subject_id INTEGER NOT NULL,
PRIMARY KEY (semester_id, student_id, subject_id),
FOREIGN KEY (semester_id, student_id) REFERENCES enrollment(semester_id, student_id),
FOREIGN KEY (semester_id, subject_id) REFERENCES class(semester_id, subject_id)
);
This way, I have my relationships properly expressed and constrained.
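As a rough check of that claim, the sketch below loads the composite-key schema into in-memory SQLite (again only a stand-in for MySQL) and tries three inserts into grades. The `class` definition is an assumption, since its columns are elided above, and the sample data and the helper names `try_insert_grade` and `run_demo` are invented for the example.

```python
import sqlite3

# Composite-key schema from the answer. The `class` columns are assumed.
COMPOSITE_SCHEMA = """
CREATE TABLE semester (semester_id INTEGER PRIMARY KEY, semester_name VARCHAR(40));
CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_name VARCHAR(40));
CREATE TABLE subject (subject_id INTEGER PRIMARY KEY, subject_name VARCHAR(40));
CREATE TABLE enrollment (
    semester_id INTEGER NOT NULL REFERENCES semester (semester_id),
    student_id INTEGER NOT NULL REFERENCES student (student_id),
    PRIMARY KEY (semester_id, student_id)
);
CREATE TABLE class (
    semester_id INTEGER NOT NULL REFERENCES semester (semester_id),
    subject_id INTEGER NOT NULL REFERENCES subject (subject_id),
    PRIMARY KEY (semester_id, subject_id)
);
CREATE TABLE grades (
    semester_id INTEGER NOT NULL,
    student_id INTEGER NOT NULL,
    subject_id INTEGER NOT NULL,
    PRIMARY KEY (semester_id, student_id, subject_id),
    FOREIGN KEY (semester_id, student_id) REFERENCES enrollment (semester_id, student_id),
    FOREIGN KEY (semester_id, subject_id) REFERENCES class (semester_id, subject_id)
);
"""


def try_insert_grade(conn, semester_id, student_id, subject_id):
    """Return True if the grades row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO grades VALUES (?, ?, ?)",
                (semester_id, student_id, subject_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def run_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(COMPOSITE_SCHEMA)
    conn.executescript("""
        INSERT INTO semester VALUES (1, 'Fall'), (2, 'Spring');
        INSERT INTO student VALUES (1, 'Alice');
        INSERT INTO subject VALUES (1, 'Algebra'), (2, 'Biology');
        INSERT INTO enrollment VALUES (1, 1);  -- Alice enrolled in Fall
        INSERT INTO class VALUES (2, 1);       -- Algebra taught in Spring
        INSERT INTO class VALUES (1, 2);       -- Biology taught in Fall
    """)
    results = {
        "algebra_as_fall": try_insert_grade(conn, 1, 1, 1),    # no Fall Algebra class
        "algebra_as_spring": try_insert_grade(conn, 2, 1, 1),  # no Spring enrollment
        "biology_as_fall": try_insert_grade(conn, 1, 1, 2),    # both parents exist
    }
    conn.close()
    return results


if __name__ == "__main__":
    print(run_demo())
```

Because grades has a single semester_id column shared by both foreign keys, there is no way to write a row that pairs a Fall enrollment with a Spring class. Only the Biology row, where both parents exist in the same semester, is accepted.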
Additional bonus: I can get my semester_name, student_name and subject_name values with simple joins (instead of having to traverse the schema through multiple levels). It's likely that you will have more levels than this simplest possible example, and then it will be even more pronounced how easy it is to reach back to any of the parent relations.
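A small sketch of what those simple joins could look like, once more using in-memory SQLite as a stand-in for MySQL. The `class` columns, the sample rows and the function name `grade_names` are assumptions made for the example.

```python
import sqlite3

# Composite-key schema from the answer. The `class` columns are assumed.
SCHEMA_AND_DATA = """
CREATE TABLE semester (semester_id INTEGER PRIMARY KEY, semester_name VARCHAR(40));
CREATE TABLE student (student_id INTEGER PRIMARY KEY, student_name VARCHAR(40));
CREATE TABLE subject (subject_id INTEGER PRIMARY KEY, subject_name VARCHAR(40));
CREATE TABLE enrollment (
    semester_id INTEGER NOT NULL REFERENCES semester (semester_id),
    student_id INTEGER NOT NULL REFERENCES student (student_id),
    PRIMARY KEY (semester_id, student_id)
);
CREATE TABLE class (
    semester_id INTEGER NOT NULL REFERENCES semester (semester_id),
    subject_id INTEGER NOT NULL REFERENCES subject (subject_id),
    PRIMARY KEY (semester_id, subject_id)
);
CREATE TABLE grades (
    semester_id INTEGER NOT NULL,
    student_id INTEGER NOT NULL,
    subject_id INTEGER NOT NULL,
    PRIMARY KEY (semester_id, student_id, subject_id),
    FOREIGN KEY (semester_id, student_id) REFERENCES enrollment (semester_id, student_id),
    FOREIGN KEY (semester_id, subject_id) REFERENCES class (semester_id, subject_id)
);
INSERT INTO semester VALUES (1, 'Fall');
INSERT INTO student VALUES (1, 'Alice');
INSERT INTO subject VALUES (1, 'Algebra');
INSERT INTO enrollment VALUES (1, 1);
INSERT INTO class VALUES (1, 1);
INSERT INTO grades VALUES (1, 1, 1);
"""


def grade_names():
    """Fetch the three names by joining grades directly to the base tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA_AND_DATA)
    # grades carries all three ids itself, so enrollment and class are not needed here.
    rows = conn.execute("""
        SELECT se.semester_name, st.student_name, su.subject_name
        FROM grades g
        JOIN semester se ON se.semester_id = g.semester_id
        JOIN student st ON st.student_id = g.student_id
        JOIN subject su ON su.subject_id = g.subject_id
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(grade_names())
```

With the surrogate-key version, the same query would have to go through enrollment and class first to find the semester, student and subject ids.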
Changing the schema is not more complicated, but even if it were: isn't data modeling about modeling data? Can we throw out referential integrity just because we adore surrogate keys?
Note: Hibernate does handle composite keys. So does DataMapper for Ruby.
Upvotes: 2
Reputation: 691765
Hibernate recommends using purely technical, auto-generated, non-composite keys (except for pure join tables). And for very good reasons, IMO.
Your mapping will be much harder with composite keys. Performance will suffer because of more complex indices. General programming will be harder too: instead of identifying a given entity with just a long, you'll need two or three longs (think about URLs and forms, for example, where you'll have to add two or three parameters or hidden fields).
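To illustrate the URL point only, here is a tiny sketch comparing the two ways of addressing one row. The `/grades` path, the parameter names and both function names are hypothetical; they are not taken from any framework.

```python
from urllib.parse import urlencode


def grade_url_composite(semester_id, student_id, subject_id):
    """With a composite key, every key part has to travel in the URL."""
    query = urlencode({
        "semester_id": semester_id,
        "student_id": student_id,
        "subject_id": subject_id,
    })
    return f"/grades?{query}"


def grade_url_surrogate(grade_id):
    """With a surrogate key, a single number identifies the row."""
    return f"/grades/{grade_id}"


if __name__ == "__main__":
    print(grade_url_composite(1, 2, 3))
    print(grade_url_surrogate(42))
```

The same duplication shows up in forms, where each key part needs its own hidden field.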
And of course, if the keys are functional, it's even worse, because you WILL have to change one of the parts of a primary key some time, which will force you to update all the references to this primary key.
Upvotes: 2
Reputation: 425391
If your many-to-many is a pure relationship (without attributes of its own) and will never be referenced on its own, just use the composite key.
If it may have attributes of its own or be referenced (for example, instantiated as a class), you may want a surrogate key, since many ORMs require the id to be a single integer.
Upvotes: 2