srmark
srmark

Reputation: 8152

How to compare values which may both be null in T-SQL

I want to make sure I'm not inserting a duplicate row into my table (e.g. only primary key different). All my fields allow NULLS as I've decided null to mean "all values". Because of nulls, the following statement in my stored procedure can't work:

IF EXISTS(SELECT * FROM MY_TABLE WHERE 
    MY_FIELD1 = @IN_MY_FIELD1  AND
    MY_FIELD2 = @IN_MY_FIELD2  AND
    MY_FIELD3 = @IN_MY_FIELD3  AND 
    MY_FIELD4 = @IN_MY_FIELD4  AND
    MY_FIELD5 = @IN_MY_FIELD5  AND
    MY_FIELD6 = @IN_MY_FIELD6)
    BEGIN
        goto on_duplicate
    END

since NULL = NULL is not true.

How can I check for the duplicates without having an IF IS NULL statement for every column?

Upvotes: 94

Views: 136457

Answers (15)

William Edmondson
William Edmondson

Reputation: 3637

You will have to use IS NULL or ISNULL. There really isn't a way around it.

Upvotes: -1

Martin Smith
Martin Smith

Reputation: 452947

From SQL Server 2022 the "null safe" version of

WHERE 
    MY_FIELD1 = @IN_MY_FIELD1  AND
    MY_FIELD2 = @IN_MY_FIELD2  AND
    MY_FIELD3 = @IN_MY_FIELD3  AND 
    MY_FIELD4 = @IN_MY_FIELD4  AND
    MY_FIELD5 = @IN_MY_FIELD5  AND
    MY_FIELD6 = @IN_MY_FIELD6

would be

WHERE 
    MY_FIELD1 IS NOT DISTINCT FROM @IN_MY_FIELD1  AND
    MY_FIELD2 IS NOT DISTINCT FROM @IN_MY_FIELD2  AND
    MY_FIELD3 IS NOT DISTINCT FROM @IN_MY_FIELD3  AND 
    MY_FIELD4 IS NOT DISTINCT FROM @IN_MY_FIELD4  AND
    MY_FIELD5 IS NOT DISTINCT FROM @IN_MY_FIELD5  AND
    MY_FIELD6 IS NOT DISTINCT FROM @IN_MY_FIELD6

IS [NOT] DISTINCT FROM (Transact-SQL)

Upvotes: 4

user3763117
user3763117

Reputation: 325

NULLIF(TARGET.relation_id, SOURCE.app_relation_id) IS NULL Simple solution

Upvotes: -1

Pxtl
Pxtl

Reputation: 955

Above, @drowa shows a verbose approach that I agree with. It's good because it avoids the 3-value logic problem. Many of the other approaches provided here will fail in subtle and unexpected ways when negated because they're treating null as equivalent to false which it is not.

However, I have a workflow that I find makes it convenient-ish, here's a regex. Given code of the form

(leftSide <=> rightSide)

regex find this:

\(([a-zA-Z0-9_.@]+)\s*<=>\s*([a-zA-Z0-9_.@]+)\)

and replace with this:

(/*$1 <=> $2*/ ($1 IS NULL AND $2 IS NULL) OR ($1 IS NOT NULL AND $2 IS NOT NULL AND $1 = $2))

So I write the (leftSide <=> rightSide) code and apply the above regex transformation to get the expanded form. It'd be nicer if MSSQL offered some kind of macro expansion so I wouldn't have to do it manually, but it doesn't.

@Drowa's answer quoted for reference:

Equals comparison:

((f1 IS NULL AND f2 IS NULL) OR (f1 IS NOT NULL AND f2 IS NOT NULL AND f1 = f2))

Not Equal To comparison: Just negate the Equals comparison above.

NOT ((f1 IS NULL AND f2 IS NULL) OR (f1 IS NOT NULL AND f2 IS NOT NULL AND f1 = f2))

Is it verbose? Yes, it is. However it's efficient since it doesn't call any function. The idea is to use short circuit in predicates to make sure the equal operator (=) is used only with non-null values, otherwise null would propagate up in the expression tree.

Upvotes: 1

drowa
drowa

Reputation: 732

Equals comparison:

((f1 IS NULL AND f2 IS NULL) OR (f1 IS NOT NULL AND f2 IS NOT NULL AND f1 = f2))

Not Equal To comparison: Just negate the Equals comparison above.

NOT ((f1 IS NULL AND f2 IS NULL) OR (f1 IS NOT NULL AND f2 IS NOT NULL AND f1 = f2))

Is it verbose? Yes, it is. However it's efficient since it doesn't call any function. The idea is to use short circuit in predicates to make sure the equal operator (=) is used only with non-null values, otherwise null would propagate up in the expression tree.

Upvotes: 12

drowa
drowa

Reputation: 732

You could use SET ANSI_NULLS in order to specify the behavior of the Equals (=) and Not Equal To (<>) comparison operators when they are used with null values.

Upvotes: 2

WileCau
WileCau

Reputation: 2187

I needed a similar comparison when doing a MERGE:

WHEN MATCHED AND (Target.Field1 <> Source.Field1 OR ...)

The additional checks are to avoid updating rows where all the columns are already the same. For my purposes I wanted NULL <> anyValue to be True, and NULL <> NULL to be False.

The solution evolved as follows:

First attempt:

WHEN MATCHED AND
(
    (
        -- Neither is null, values are not equal
        Target.Field1 IS NOT NULL
            AND Source.Field1 IS NOT NULL
            AND Target.Field1 <> Source.Field1
    )
    OR
    (
        -- Target is null but source is not
        Target.Field1 IS NULL
            AND Source.Field1 IS NOT NULL
    )
    OR
    (
        -- Source is null but target is not
        Target.Field1 IS NOT NULL
            AND Source.Field1 IS NULL
    )

    -- OR ... Repeat for other columns
)

Second attempt:

WHEN MATCHED AND
(
    -- Neither is null, values are not equal
    NOT (Target.Field1 IS NULL OR Source.Field1 IS NULL)
        AND Target.Field1 <> Source.Field1

    -- Source xor target is null
    OR (Target.Field1 IS NULL OR Source.Field1 IS NULL)
        AND NOT (Target.Field1 IS NULL AND Source.Field1 IS NULL)

    -- OR ... Repeat for other columns
)

Third attempt (inspired by @THEn's answer):

WHEN MATCHED AND
(

    ISNULL(
        NULLIF(Target.Field1, Source.Field1),
        NULLIF(Source.Field1, Target.Field1)
    ) IS NOT NULL

    -- OR ... Repeat for other columns
)

The same ISNULL/NULLIF logic can be used to test equality and inequality:

  • Equality: ISNULL(NULLIF(A, B), NULLIF(B, A)) IS NULL
  • Inequaltiy: ISNULL(NULLIF(A, B), NULLIF(B, A)) IS NOT NULL

Here is an SQL-Fiddle demonstrating how it works http://sqlfiddle.com/#!3/471d60/1

Upvotes: 50

Jamie G
Jamie G

Reputation: 81

What if you want to do a comparison for values that ARE NOT equal? Just using a "NOT" in front of the previously mentioned comparisons does not work. The best I could come up with is:

(Field1 <> Field2) OR (NULLIF(Field1, Field2) IS NOT NULL) OR (NULLIF(Field2, Field1) IS NOT NULL)

Upvotes: 7

Graeme Job
Graeme Job

Reputation: 1149

Along the same lines as @Eric's answer, but without using a 'NULL' symbol.

(Field1 = Field2) OR (ISNULL(Field1, Field2) IS NULL)

This will be true only if both values are non-NULL, and equal each other, or both values are NULL

Upvotes: 108

THEn
THEn

Reputation: 1938

Did you check NULLIF? http://msdn.microsoft.com/en-us/library/ms177562.aspx

Upvotes: -3

Shannon Severance
Shannon Severance

Reputation: 18410

IF EXISTS(SELECT * FROM MY_TABLE WHERE 
            (MY_FIELD1 = @IN_MY_FIELD1 
                     or (MY_FIELD1 IS NULL and @IN_MY_FIELD1 is NULL))  AND
            (MY_FIELD2 = @IN_MY_FIELD2 
                     or (MY_FIELD2 IS NULL and @IN_MY_FIELD2 is NULL))  AND
            (MY_FIELD3 = @IN_MY_FIELD3 
                     or (MY_FIELD3 IS NULL and @IN_MY_FIELD3 is NULL))  AND
            (MY_FIELD4 = @IN_MY_FIELD4 
                     or (MY_FIELD4 IS NULL and @IN_MY_FIELD4 is NULL))  AND
            (MY_FIELD5 = @IN_MY_FIELD5 
                     or (MY_FIELD5 IS NULL and @IN_MY_FIELD5 is NULL))  AND
            (MY_FIELD6 = @IN_MY_FIELD6
                     or (MY_FIELD6 IS NULL and @IN_MY_FIELD6 is NULL)))
            BEGIN
                    goto on_duplicate
            END

Wordy As compared to the IFNULL/COALESCE solution. But will work without having to think about what value will not appear in the data that can be used as the stand in for NULL.

Upvotes: 21

Quassnoi
Quassnoi

Reputation: 425251

Use INTERSECT operator.

It's NULL-sensitive and efficient if you have a composite index on all your fields:

IF      EXISTS
        (
        SELECT  MY_FIELD1, MY_FIELD2, MY_FIELD3, MY_FIELD4, MY_FIELD5, MY_FIELD6
        FROM    MY_TABLE
        INTERSECT
        SELECT  @IN_MY_FIELD1, @IN_MY_FIELD2, @IN_MY_FIELD3, @IN_MY_FIELD4, @IN_MY_FIELD5, @IN_MY_FIELD6
        )
BEGIN
        goto on_duplicate
END

Note that if you create a UNIQUE index on your fields, your life will be much simpler.

Upvotes: 59

Eric
Eric

Reputation: 95093

Use ISNULL:

ISNULL(MY_FIELD1, 'NULL') = ISNULL(@IN_MY_FIELD1, 'NULL')

You can change 'NULL' to something like 'All Values' if it makes more sense to do so.

It should be noted that with two arguments, ISNULL works the same as COALESCE, which you could use if you have a few values to test (i.e.-COALESCE(@IN_MY_FIELD1, @OtherVal, 'NULL')). COALESCE also returns after the first non-null, which means it's (marginally) faster if you expect MY_FIELD1 to be blank. However, I find ISNULL much more readable, so that's why I used it, here.

Upvotes: 35

butterchicken
butterchicken

Reputation: 13883

You could coalesce each value, but it's a bit wince-inducing:

    IF EXISTS(SELECT * FROM MY_TABLE WHERE 
    coalesce(MY_FIELD1,'MF1') = coalesce(@IN_MY_FIELD1,'MF1')  AND
    ...
    BEGIN
            goto on_duplicate
    END

You'd also need to ensure that the coalesced value is not an otherwise valid value on the column in question. For example, if it was possible that the value of MY_FIELD1 could be 'MF1' then this would cause a lot of spurious hits.

Upvotes: 11

Remus Rusanu
Remus Rusanu

Reputation: 294177

You create a primary key on your fields and let the engine enforce the uniqueness. Doing IF EXISTS logic is incorrect anyway as is flawed with race conditions.

Upvotes: 6

Related Questions