Reputation: 15355
This is my simplified statement
SELECT ...
FROM tab1 AS i FULL OUTER JOIN tab2 AS d ON i.[Id]=d.[Id]
WHERE d.[Data]<>i.[Data] OR
(d.[Data] IS NULL AND i.[Data] IS NOT NULL) OR
(d.[Data] IS NOT NULL AND i.[Data] IS NULL)
I want to get all entries that are
So I don't want to see records were and i and d contain the same data or are both NULL.
My statement look so long and complicated. Is there an easier way?
Using ISNULL(d.[Data],'')<>ISNULL(i.[Data],'')
works for text, but not for DATE
or TIME(0)
columns.
My statement works for every type.
Upvotes: 3
Views: 577
Reputation: 71364
Yes you can, and you can get the optimizer to recognize it too.
Paul White has this little ditty:
WHERE NOT EXISTS (
SELECT d.[Data]
INTERSECT
SELECT i.[Data])
This works because of the semantics of INTERSECT
which deal with nulls. What this says is "are there no rows in the subquery made up of value B and value B", this will only be satisfied if they are different values or one is null and the other not. If both are nulls, there will be a row with a null.
If you check the XML query plan (not the graphical one in SSMS), you will see that it compiles all the way down to d.[Data] <> i.[Data]
, but the operator it uses will have CompareOp="IS"
and not EQ
.
See the full plan here.
The relevant part of the plan is:
<Predicate>
<ScalarOperator ScalarString="@t1.[i] as [t1].[i] = @t2.[i] as [t2].[i]">
<Compare CompareOp="IS">
<ScalarOperator>
<Identifier>
<ColumnReference Table="@t1" Alias="[t1]" Column="i" />
</Identifier>
</ScalarOperator>
<ScalarOperator>
<Identifier>
<ColumnReference Table="@t2" Alias="[t2]" Column="i" />
</Identifier>
</ScalarOperator>
</Compare>
</ScalarOperator>
</Predicate>
I find the optimizer works very well this way round, rather than doing EXISTS / EXCEPT
.
I urge you to vote for the Azure Feedback to implement a proper operator.
UPDATE FOR SQL SERVER 2022: This operator is now available as IS [NOT] DISTINCT FROM
.
Upvotes: 4
Reputation: 71364
A new option is now available in SQL Server 2022, which has now implemented the ISO-standard IS [NOT] DISTINCT FROM
syntax mentioned in the comments.
SELECT ...
FROM tab1 AS i
FULL OUTER JOIN tab2 AS d ON i.Id = d.Id
WHERE d.Data IS DISTINCT FROM i.Data;
This compiles into an IS
comparison underneath (the same as in my other answer using EXCEPT
), and is very efficient.
Upvotes: 3