w0051977
w0051977

Reputation: 15817

SQL - Null foreign key or junction table?

My question relates to 0 to many relationships. For example, if you have a customer table and an Order table and a customer can have one or many orders and an order can have zero or one customer. The reason it can have zero is because orders were imported from a legacy order system where customer details where not held.

I believe there are two options:

  1. Set the CustomerID (foreign ket in Orders table) to null for those orders without a customer.
  2. Create a junction table between customers and orders containing CustomerID and OrderID. However, I thought junction tables were specifically for many to many relationships

Which is the best alternative?

EDIT:

I have discovered the following post and agree with Molf's answer, which is option 2 in my original question: Any example of a necessary nullable foreign key?.

Upvotes: 1

Views: 1072

Answers (2)

Cade Roux
Cade Roux

Reputation: 89741

I would not create the junction table, unless you are initially going to restrict the order key to be unique and/or foresee the relationship becoming a many-to-many.

The question in the simpler two-table option with an ordinary single FK is whether you want to allow a NULL customer FK in the order rows. It is certainly possible to not allow this (for obvious good referential integrity reasons going forward) and have an "UNKNOWN CUSTOMER" customer which is used for all legacy orders with unknown customers. In a very good way, this replaces a NULL (which can be misinterpreted) with something explicit (the reference to an "UNKNOWN CUSTOMER", which can always be fixed once the real customer is created or identified).

I would strongly consider that, because one would hope that the legacy data would be a relatively smaller and smaller portion of your data going forward and you want to ensure as much RI in your data for the typical case (all cases going forward MUST identify a customer during order creation) while still accommodating the legacy cases (which diminish in significance in your data universe over time)

Upvotes: 3

dfb
dfb

Reputation: 13289

You have a redundancy here that is conceptually confusing, though you may have a requirement for this. If you did this from scratch, you could simply have Orders have a NULLable FK of Customer.CustomerID. This would satisfy the having 1 customer - many orders and 1/0 orders to a customer. You could also do it by having the junction table, since this is technically a many-to-many relationship.

The best alternative is going to be dependent on how you are querying the data. For example, doing analytical-type queries on your Orders table might be more convenient

Take finding orders with no customer

SELECT SUM(CASE WHEN customer iS NULL THEN 1 ELSE 0 END) FROM Orders

vs

SELECT SUM(CASE WHEN j.oid iS NULL THEN 1 ELSE 0 END) FROM 
Orders o  LEFT OUTER JOIN Junction j 
WHERE o.o_id = j.o_id

You can come up with examples where the other makes cleaner, more efficient queries as well.

Upvotes: 1

Related Questions