Nuno G
Nuno G

Reputation: 2105

How to create a unique index on a NULL column?

I am using SQL Server 2005. I want to constrain the values in a column to be unique, while allowing NULLS.

My current solution involves a unique index on a view like so:

CREATE VIEW vw_unq WITH SCHEMABINDING AS
    SELECT Column1
      FROM MyTable
     WHERE Column1 IS NOT NULL

CREATE UNIQUE CLUSTERED INDEX unq_idx ON vw_unq (Column1)

Any better ideas?

Upvotes: 118

Views: 128242

Answers (5)

Phil Haselden
Phil Haselden

Reputation: 3006

Using SQL Server 2008, you can create a filtered index.

CREATE UNIQUE INDEX AK_MyTable_Column1 ON MyTable (Column1) WHERE Column1 IS NOT NULL

Another option is a trigger to check uniqueness, but this could affect performance.

Upvotes: 122

Martin Staufcik
Martin Staufcik

Reputation: 9490

It is possible to use filter predicates to specify which rows to include in the index.

From the documentation:

WHERE <filter_predicate> Creates a filtered index by specifying which rows to include in the index. The filtered index must be a nonclustered index on a table. Creates filtered statistics for the data rows in the filtered index.

Example:

CREATE TABLE Table1 (
  NullableCol int NULL
)

CREATE UNIQUE INDEX IX_Table1 ON Table1 (NullableCol) WHERE NullableCol IS NOT NULL;

Upvotes: 1

Castamos
Castamos

Reputation: 483

Strictly speaking, a unique nullable column (or set of columns) can be NULL (or a record of NULLs) only once, since having the same value (and this includes NULL) more than once obviously violates the unique constraint.

However, that doesn't mean the concept of "unique nullable columns" is valid; to actually implement it in any relational database we just have to bear in mind that this kind of databases are meant to be normalized to properly work, and normalization usually involves the addition of several (non-entity) extra tables to establish relationships between the entities.

Let's work a basic example considering only one "unique nullable column", it's easy to expand it to more such columns.

Suppose we the information represented by a table like this:

create table the_entity_incorrect
(
  id integer,
  uniqnull integer null, /* we want this to be "unique and nullable" */
  primary key (id)
);

We can do it by putting uniqnull apart and adding a second table to establish a relationship between uniqnull values and the_entity (rather than having uniqnull "inside" the_entity):

create table the_entity
(
  id integer,
  primary key(id)
);

create table the_relation
(
  the_entity_id integer not null,
  uniqnull integer not null,

  unique(the_entity_id),
  unique(uniqnull),
  /* primary key can be both or either of the_entity_id or uniqnull */
  primary key (the_entity_id, uniqnull), 
  foreign key (the_entity_id) references the_entity(id)
);

To associate a value of uniqnull to a row in the_entity we need to also add a row in the_relation.

For rows in the_entity were no uniqnull values are associated (i.e. for the ones we would put NULL in the_entity_incorrect) we simply do not add a row in the_relation.

Note that values for uniqnull will be unique for all the_relation, and also notice that for each value in the_entity there can be at most one value in the_relation, since the primary and foreign keys on it enforce this.

Then, if a value of 5 for uniqnull is to be associated with an the_entity id of 3, we need to:

start transaction;
insert into the_entity (id) values (3); 
insert into the_relation (the_entity_id, uniqnull) values (3, 5);
commit;

And, if an id value of 10 for the_entity has no uniqnull counterpart, we only do:

start transaction;
insert into the_entity (id) values (10); 
commit;

To denormalize this information and obtain the data a table like the_entity_incorrect would hold, we need to:

select
  id, uniqnull
from
  the_entity left outer join the_relation
on
  the_entity.id = the_relation.the_entity_id
;

The "left outer join" operator ensures all rows from the_entity will appear in the result, putting NULL in the uniqnull column when no matching columns are present in the_relation.

Remember, any effort spent for some days (or weeks or months) in designing a well normalized database (and the corresponding denormalizing views and procedures) will save you years (or decades) of pain and wasted resources.

Upvotes: -3

onedaywhen
onedaywhen

Reputation: 57033

The calculated column trick is widely known as a "nullbuster"; my notes credit Steve Kass:

CREATE TABLE dupNulls (
pk int identity(1,1) primary key,
X  int NULL,
nullbuster as (case when X is null then pk else 0 end),
CONSTRAINT dupNulls_uqX UNIQUE (X,nullbuster)
)

Upvotes: 76

willasaywhat
willasaywhat

Reputation: 2372

Pretty sure you can't do that, as it violates the purpose of uniques.

However, this person seems to have a decent work around: http://sqlservercodebook.blogspot.com/2008/04/multiple-null-values-in-unique-index-in.html

Upvotes: 25

Related Questions