Reputation: 15817
My question relates to 0 to many relationships. For example, if you have a customer table and an Order table and a customer can have one or many orders and an order can have zero or one customer. The reason it can have zero is because orders were imported from a legacy order system where customer details where not held.
I believe there are two options:
Which is the best alternative?
EDIT:
I have discovered the following post and agree with Molf's answer, which is option 2 in my original question: Any example of a necessary nullable foreign key?.
Upvotes: 1
Views: 1072
Reputation: 89741
I would not create the junction table, unless you are initially going to restrict the order key to be unique and/or foresee the relationship becoming a many-to-many.
The question in the simpler two-table option with an ordinary single FK is whether you want to allow a NULL customer FK in the order rows. It is certainly possible to not allow this (for obvious good referential integrity reasons going forward) and have an "UNKNOWN CUSTOMER" customer which is used for all legacy orders with unknown customers. In a very good way, this replaces a NULL (which can be misinterpreted) with something explicit (the reference to an "UNKNOWN CUSTOMER", which can always be fixed once the real customer is created or identified).
I would strongly consider that, because one would hope that the legacy data would be a relatively smaller and smaller portion of your data going forward and you want to ensure as much RI in your data for the typical case (all cases going forward MUST identify a customer during order creation) while still accommodating the legacy cases (which diminish in significance in your data universe over time)
Upvotes: 3
Reputation: 13289
You have a redundancy here that is conceptually confusing, though you may have a requirement for this. If you did this from scratch, you could simply have Orders have a NULLable FK of Customer.CustomerID. This would satisfy the having 1 customer - many orders and 1/0 orders to a customer. You could also do it by having the junction table, since this is technically a many-to-many relationship.
The best alternative is going to be dependent on how you are querying the data. For example, doing analytical-type queries on your Orders table might be more convenient
Take finding orders with no customer
SELECT SUM(CASE WHEN customer iS NULL THEN 1 ELSE 0 END) FROM Orders
vs
SELECT SUM(CASE WHEN j.oid iS NULL THEN 1 ELSE 0 END) FROM
Orders o LEFT OUTER JOIN Junction j
WHERE o.o_id = j.o_id
You can come up with examples where the other makes cleaner, more efficient queries as well.
Upvotes: 1