Reputation: 2199
In an ERD, a weak/non-identifying relationship is one that connects two strong entities, and is indicated with a dashed line. A strong/identifying relationship is one that connects a strong entity to a weak entity (a weak entity is one that contains the foreign key [FK] from its related entity as a component of its own primary key [PK]), and is indicated by a solid line.
My question is: so what? Why is it so important to distinguish between weak/non-identifying relationships and strong/identifying relationships that ERD designers are supposed to mark the distinction with dashed versus solid lines, respectively? Why does it matter so much?
For me, every element and convention in an ERD should add necessary information that either translates directly into the database design (that is, DDL SQL statements), or at least explains information that is important but not necessarily obvious (an example of this last case would be naming the relationships: the names do not translate into SQL, but they are very useful for understanding the ERD). Here is a sample ERD for the sake of discussion (modified from another Stack Overflow question):
I have considered this a lot, and to me, the only information that solid versus dashed lines add is already adequately conveyed in the following conventions:
As far as I can see, the solid versus dashed relationship line adds no additional useful information. Rather than adding information, this convention is unintuitive and confusing. As just one example of the confusion it causes, there are many duplicate questions here on Stack Overflow that ask which is which; here are just a few examples:
Can anyone explain to me what additional information that convention adds that is not contained in the fact that an FK might or might not be part of a PK? I am seriously considering just ignoring the convention completely (that is, I want to start drawing my ERDs with all solid lines), but I would really appreciate it if someone could point out something important that I'm overlooking.
Upvotes: 6
Views: 8228
Reputation: 73
In my opinion, an ERD is not so much about a database. It is about understanding the segment of the world I am dealing with, especially the lifecycle of things.
A spreadsheet cell is identified by its row and column. It exists only because both its row and its column exist. As soon as you delete / shred / burn either its row or its column, you have done the same to its cells. You might instead cut the sheet up with scissors into its individual cells, but at that point they cannot be neatly summarized or sorted and their formulae lose their meaning, because they are no longer spreadsheet cells: they have lost that identity.
A green, live leaf is identified by the plant it grows on. In fall the leaves fall. You probably could not tell which leaf fell from which tree; they have lost their identity as live leaves, which must have a tree to cling to. If you tear off a live leaf, you cannot put it back onto another tree. If you kill or fell the tree, all its green, live leaves die too.
If you find a Christmas decoration on a tree, and later you find something strikingly similar on another tree, it could be the very same decoration moved over. It is not in an identifying relationship with the tree. You can also just put it in the attic with no tree anywhere near.
If you find a leaf growing on a tree, and later an identical-looking leaf growing on another tree, you can be sure it is not the same leaf.
I think the difference between the tree's relationship with its leaves and that with its decorations is pretty profound. If you design a computer system keeping record of trees, leaves, and decorations, I think it is useful to indicate this difference on your design diagrams.
Developers will then know what the constructor method signatures will be. There will be no leaf constructor without a mandatory tree argument. But there will be a decoration constructor without a tree argument. There will be no moveToTree method in the leaf class. There could be a moveToTree method in the decoration class. You could optimize your garbage collector to quickly identify all leaf objects whose tree variable points to a dead tree, and reclaim them. This optimization is not applicable to decoration objects.
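To make the constructor point concrete, here is a minimal sketch in Python. The class names follow the answer's examples, but the attributes (`name`, `position`, `serial`) and the `identity` property are assumptions made up for illustration, not part of any real design.

```python
from __future__ import annotations

from typing import Optional


class Tree:
    def __init__(self, name: str) -> None:
        self.name = name


class Leaf:
    """Identifying relationship: a leaf cannot exist without its tree."""

    def __init__(self, tree: Tree, position: int) -> None:
        # The tree is a required argument and is never reassigned.
        self._tree = tree
        self.position = position

    @property
    def tree(self) -> Tree:
        # Read-only: there is deliberately no setter and no move_to_tree().
        return self._tree

    @property
    def identity(self) -> tuple[str, int]:
        # Hypothetical identity: the tree's name plus a position on that tree.
        return (self._tree.name, self.position)


class Decoration:
    """Optional, non-identifying relationship: it has its own identity."""

    def __init__(self, serial: str, tree: Optional[Tree] = None) -> None:
        self.serial = serial
        self.tree = tree  # None means "in the attic"

    def move_to_tree(self, tree: Optional[Tree]) -> None:
        self.tree = tree


oak = Tree("oak")
fir = Tree("fir")
leaf = Leaf(oak, 1)            # Leaf() without a tree raises TypeError
bauble = Decoration("B-17")    # no tree needed
bauble.move_to_tree(oak)
bauble.move_to_tree(fir)       # same decoration, different tree
```

The decoration keeps its `serial` wherever it hangs, whereas the leaf's identity is derived from its tree, which is the distinction the solid line is meant to signal.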
If you elect to use a relational database for persistence, developers will know that the decoration table's FK to tree table will have to be optional, and not a PK field. But the live leaf table's FK to the tree table will be mandatory, and part of its composite PK.
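As a rough sketch of what that could look like, the following uses Python's built-in `sqlite3` module with an in-memory database. The table and column names (`leaf_no`, `decoration_id`) are hypothetical, and the `ON DELETE CASCADE` / `ON DELETE SET NULL` actions are one assumed way to express the lifecycle rules described above, not something the ERD notation itself dictates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE tree (
    tree_id INTEGER PRIMARY KEY
);
-- Identifying relationship: the FK is mandatory and part of the composite PK.
CREATE TABLE leaf (
    tree_id INTEGER NOT NULL REFERENCES tree(tree_id) ON DELETE CASCADE,
    leaf_no INTEGER NOT NULL,
    PRIMARY KEY (tree_id, leaf_no)
);
-- Non-identifying, optional relationship: the FK is nullable and not in the PK.
CREATE TABLE decoration (
    decoration_id INTEGER PRIMARY KEY,
    tree_id INTEGER REFERENCES tree(tree_id) ON DELETE SET NULL
);
""")

conn.execute("INSERT INTO tree (tree_id) VALUES (1)")
conn.execute("INSERT INTO leaf (tree_id, leaf_no) VALUES (1, 1)")
conn.execute("INSERT INTO decoration (decoration_id, tree_id) VALUES (10, 1)")
conn.execute("INSERT INTO decoration (decoration_id, tree_id) VALUES (11, NULL)")

# Felling the tree removes its leaves but leaves the decorations behind.
conn.execute("DELETE FROM tree WHERE tree_id = 1")
leaves_left = conn.execute("SELECT COUNT(*) FROM leaf").fetchone()[0]
decorations_left = conn.execute(
    "SELECT decoration_id, tree_id FROM decoration ORDER BY decoration_id"
).fetchall()
```

After the delete, the leaf rows are gone while both decoration rows survive with no tree, mirroring the leaf and decoration lifecycles in the analogy.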
To contrast the leaf-tree identifying relationship with a mandatory non-identifying relationship: for an HIV specimen, having a host is not optional, because outside one it disintegrates. However, it can be transferred to another host, so although the relationship is mandatory, it is not identifying.
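The mandatory but non-identifying case can be sketched the same way, again with Python's `sqlite3` module and an in-memory database. The `host` and `specimen` tables and their columns are hypothetical names chosen for illustration: the FK is `NOT NULL` but stays outside the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE host (
    host_id INTEGER PRIMARY KEY
);
-- Mandatory, non-identifying: the FK is NOT NULL but is not part of the PK.
CREATE TABLE specimen (
    specimen_id INTEGER PRIMARY KEY,
    host_id INTEGER NOT NULL REFERENCES host(host_id)
);
INSERT INTO host (host_id) VALUES (1), (2);
INSERT INTO specimen (specimen_id, host_id) VALUES (100, 1);
""")

# The specimen keeps its own identity when it moves to another host.
conn.execute("UPDATE specimen SET host_id = 2 WHERE specimen_id = 100")
current_host = conn.execute(
    "SELECT host_id FROM specimen WHERE specimen_id = 100"
).fetchone()[0]

# A specimen with no host at all is rejected by the NOT NULL constraint.
try:
    conn.execute("INSERT INTO specimen (specimen_id, host_id) VALUES (101, NULL)")
    hostless_allowed = True
except sqlite3.IntegrityError:
    hostless_allowed = False
```

Because `host_id` is not part of the key, reassigning it does not change which specimen the row represents; with a composite key like the leaf's, the same update would amount to creating a different leaf.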
Upvotes: -1
Reputation: 25534
A convention used in ER model diagrams is that referencing (foreign key) attributes are not shown at all unless they are part of a primary key. If referencing attributes are required they are supposed to be implied by the existence of a relationship line. Accordingly, there is no standard or generally agreed ER notation for foreign key attributes even when they are part of a primary key. The case where referencing attributes are needed in order to identify instances of an entity is often called out on ER diagrams by using a dotted relationship line. The motivation here is presumably that "primary" key attributes are deemed to be mandatory and significant so their dependence on other things is also significant.
If your diagram shows foreign key attributes in some other way, then the distinction between identifying and non-identifying relationships is unimportant, in my view. Whatever notation you use, ultimately what matters is that your audience understands your diagram correctly.
Upvotes: 2