Reputation: 1708
I have a data structure, where I have to store pairs of elements. Each pair has exactly 2 values in it, so we are employing a table, with the fields(leftvalue, rightvalue....). These pairs should be unique, and they are considered the same, if the keys are changed.
Example: (Fruit, Apple) is the same as (Apple, Fruit).
If it is possible in an efficient way, I would put a database constraint on the fields, but not at any cost - performance is more important.
We are using MSSQL server 2008
currently, but an update is possible.
Is there an efficient way of achieving this?
Upvotes: 3
Views: 2787
Reputation: 35696
If you always store the values in order but store the direction in another column,
CREATE TABLE [Pairs]
(
[A] NVarChar(MAX) NOT NULL,
[B] NVarChar(MAX) NOT NULL,
[DirectionAB] Bit NOT NULL,
CONSTRAINT [PK_Pairs] PRIMARY KEY ([A],[B])
)
You can acheive exaclty what you want with one clustered index, and optimize your lookups too.
So when I insert the pair 'Apple', 'Fruit'
I'd do,
INSERT [Pairs] VALUES ('Apple', 'Friut', 1);
Nice and easy. Then I insert 'Fruit', 'Apple'
,
INSERT [Pairs] VALUES ('Apple', 'Fruit', 0); -- 0 becuase order is reversed.
The insert fails because this is a primary key violation. To further illustrate, the pair 'Coconuts', 'Bananas'
would be stored as
INSERT [Pairs] VALUES ('Bananas', 'Coconuts', 0);
For additional lookup performance, I'd add the index
CREATE NONCLUSTERED INDEX [IX_Pairs_Reverse] ON [Pairs] ([B], [A]);
If you can't control inserts to the table, it may be necessary to ensure that [A]
and [B]
are inserted correctly.
CONSTRAINT [CK_Pairs_ALessThanB] CHECK ([A] < [B])
But this may be an unnecessary performance hit, depending on how controlled your inserts are.
Upvotes: 2
Reputation: 2604
One way would be to create a computed column that combines the two values and put a unique constraint upon it:
create table #test (
a varchar(10) not null,
b varchar(10) not null,
both as case when a > b then a + ':' + b else b + ':' + a end persisted unique nonclustered
)
so
insert #test
select 'apple', 'fruit'
insert #test
select 'fruit', 'apple'
Gives
(1 row(s) affected)
Msg 2627, Level 14, State 1, Line 3
Violation of UNIQUE KEY constraint 'UQ__#test_____55252CB631EC6D26'. Cannot insert duplicate key in object 'dbo.#test'.
The statement has been terminated.
Upvotes: 0
Reputation: 239646
Two solutions, both really about changing the problem into an easier one. I'd usually prefer the T1
solution if forcing a change on consumers is acceptable:
create table dbo.T1 (
Lft int not null,
Rgt int not null,
constraint CK_T1 CHECK (Lft < Rgt),
constraint UQ_T1 UNIQUE (Lft,Rgt)
)
go
create table dbo.T2 (
Lft int not null,
Rgt int not null
)
go
create view dbo.T2_DRI
with schemabinding
as
select
CASE WHEN Lft<Rgt THEN Lft ELSE Rgt END as Lft,
CASE WHEN Lft<Rgt THEN Rgt ELSE Lft END as Rgt
from dbo.T2
go
create unique clustered index IX_T2_DRI on dbo.T2_DRI(Lft,Rgt)
go
In both cases, neither T1
nor T2
can contain duplicate values in the Lft,Rgt
pairs.
Upvotes: 5