Reputation: 679
I'm using Microsoft SQL Server 2017 and was curious about how to constrain a specific relationship. I'm having a bit of trouble articulating so I'd prefer to share through an example.
Consider the following hypothetical database.
Customers
+---------------+
| Id | Name |
+---------------+
| 1 | Sam |
| 2 | Jane |
+---------------+
Addresses
+----------------------------------------+
| Id | CustomerId | Address |
+----------------------------------------+
| 1 | 1 | 105 Easy St |
| 2 | 1 | 9 Gale Blvd |
| 3 | 2 | 717 Fourth Ave |
+------+--------------+------------------+
Orders
+-----------------------------------+
| Id | CustomerId | AddressId |
+-----------------------------------+
| 1 | 1 | 1 |
| 2 | 2 | 3 |
| 3 | 1 | 3 | <--- Invalid Customer/Address Pair
+-----------------------------------+
Notice that the final Order
links a customer to an address that isn't theirs. I'm looking for a way to prevent this.
(You may ask why I need the CustomerId
in the Orders
table at all. To be clear, I recognize that the Address
already offers me the same information without the possibility of invalid pairs. However, I'd prefer to have an Order
flattened such that I don't have to channel through an address to retrieve a customer.)
From the related reading I was able to find, it seems that one method may be to enable a CHECK
constraint targeting a User-Defined Function. This User-Defined Function would be something like the following:
WHERE EXISTS (SELECT 1 FROM Addresses WHERE Id = Order.AddressId AND CustomerId = Order.CustomerId)
While I imagine this would work, given the somewhat "generality" of the articles I was able to find, I don't feel entirely confident that this is my best option.
An alternative might be to remove the CustomerId
column from the Addresses
table entirely, and instead add another table with Id
, CustomerId
, AddressId
. The Order
would then reference this Id
instead. Again, I don't love the idea of having to channel through an auxiliary table to get a Customer
or Address
.
Is there a cleaner way to do this? Or am I simply going about this all wrong?
Upvotes: 1
Views: 52
Reputation: 11486
Good question, however at the root it seems you are struggling with creating a foreign key constraint to something that is not a foreign key:
Orders.CustomerId -> Addresses.CustomerId
There is no simple built-in way to do this because it is normally not done. In ideal RDBMS practices you should strive to encapsulate data of specific types in their own tables only. In other words, try to avoid redundant data.
In the example case above the address ownership is redundant in both the address table and the orders table, because of this it is requiring additional checks to keep them synchronized. This can easily get out of hand with bigger datasets.
You mentioned:
However, I'd prefer to have an Order flattened such that I don't have to channel through an address to retrieve a customer.
But that is why a relational database is relational. It does this so that distinct data can be kept distinct and referenced with relative IDs.
I think the best solution would be to simply drop this requirement.
In other words, just go with:
Customers
+---------------+
| Id | Name |
+---------------+
| 1 | Sam |
| 2 | Jane |
+---------------+
Addresses
+----------------------------------------+
| Id | CustomerId | Address |
+----------------------------------------+
| 1 | 1 | 105 Easy St |
| 2 | 1 | 9 Gale Blvd |
| 3 | 2 | 717 Fourth Ave |
+------+--------------+------------------+
Orders
+--------------------+
| Id | AddressId |
+--------------------+
| 1 | 1 |
| 2 | 3 |
| 3 | 3 | <--- Valid Order/Address Pair
+--------------------+
With that said, to accomplish your purpose exactly, you do have views available for this kind of thing:
create view CustomerOrders
as
select o.Id OrderId,
a.CustomerId,
o.AddressId
from Orders
join Addresses a on a.Id = o.AddressId
I know this is a pretty trivial use-case for a view but I wanted to put in a plug for it because they are often neglected and come in handy with organizing big data sets. Using WITH SCHEMABINDING
they can also be indexed for performance.
Upvotes: 1
Reputation: 37472
You may ask why I need the
CustomerId
in theOrders
table at all. To be clear, I recognize that theAddress
already offers me the same information without the possibility of invalid pairs. However, I'd prefer to have anOrder
flattened such that I don't have to channel through an address to retrieve a customer.
If you face performance problems, the first thing is to create or amend proper indexes. And DBMS are usually good at join operations (with proper indexes). But yes normalization can sometimes help in performance tuning. But it should be a last resort. And if that route is taken, one should really know what one is doing and be very careful not to damage more at the end of a day, that one has gained. I have doubts, that you're out of options here and really need to go that path. You're likely barking up the wrong tree. Therefore I recommend you take the "normal", "sane" way and just drop customerid
in orders
and create proper indexes.
But if you really insist, you can try to make (id, customerid)
a key in addresses
(with a unique constraint) and then create a foreign key based on that.
ALTER TABLE addresses
ADD UNIQUE (id,
customerid);
ALTER TABLE orders
ADD FOREIGN KEY (addressid,
customerid)
REFERENCES addresses
(id,
customerid);
Upvotes: 1