Reputation:
I am trying to compare two tables in SQL Server to verify some data. I want to return all the rows from both tables where data is in one table or the other, but not both. In essence, I want to show all the discrepancies. I need to check three pieces of data in doing so: FirstName, LastName and Product.
I'm fairly new to SQL, and it seems like a lot of the solutions I'm finding are overcomplicating things. I don't have to worry about NULLs.
I started by trying something like this:
SELECT DISTINCT [First Name], [Last Name], [Product Name] FROM [Temp Test Data]
WHERE ([First Name] NOT IN (SELECT [First Name]
FROM [Real Data]))
I'm having trouble taking this further though.
Thanks!
EDIT:
Based on the answer by @treaschf I have been trying to use a variation of the following query:
SELECT td.[First Name], td.[Last Name], td.[Product Name]
FROM [Temp Test Data] td FULL OUTER JOIN [Data] AS d
ON td.[First Name] = d.[First Name] AND td.[Last Name] = d.[Last Name]
WHERE (d.[First Name] = NULL) AND (d.[Last Name] = NULL)
But I keep getting 0 results back, when I know that there is at least 1 row in td that is not in d.
EDIT:
Ok, I think I figured it out. At least in my few minutes of testing it seems to work well enough.
SELECT [First Name], [Last Name]
FROM [Temp Test Data] AS td
WHERE (NOT EXISTS
(SELECT [First Name], [Last Name]
FROM [Data] AS d
WHERE ([First Name] = td.[First Name]) OR ([Last Name] = td.[Last Name])))
This is basically going to tell me what is in my test data that is not in my real data. Which is completely fine for what I need to do.
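For anyone who does need the discrepancies from both sides, here is a sketch of the same NOT EXISTS approach run in both directions, using the table and column names from the question and assuming a row only counts as matched when all three columns match:
SELECT 'Temp Test Data' AS [Source], td.[First Name], td.[Last Name], td.[Product Name]
FROM [Temp Test Data] AS td
WHERE NOT EXISTS (SELECT 1 FROM [Data] AS d
                  WHERE d.[First Name] = td.[First Name]
                    AND d.[Last Name] = td.[Last Name]
                    AND d.[Product Name] = td.[Product Name])
UNION ALL
SELECT 'Data' AS [Source], d.[First Name], d.[Last Name], d.[Product Name]
FROM [Data] AS d
WHERE NOT EXISTS (SELECT 1 FROM [Temp Test Data] AS td
                  WHERE td.[First Name] = d.[First Name]
                    AND td.[Last Name] = d.[Last Name]
                    AND td.[Product Name] = d.[Product Name])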
Upvotes: 281
Views: 982502
Reputation: 21
I was inspired by Thato's comment on this answer, where he linked to his dynamic solution. I found some great simplifications using techniques that weren't available 10 years ago when he created his version, so I want to share them here. His blog throws an error when I try to post there, so this will have to do.
I found using the JSON features of SQL Server to flatten all the columns and values very convenient. I was able to make it work with just one dynamic SQL statement, which I use twice to get the flattened data from each table. All the rest of the work is native SQL within the sproc.
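As a quick standalone illustration of that flattening idea (the table and column names here are hypothetical, not part of the sproc below):
-- Flatten each row of a (hypothetical) table into key/value pairs via JSON.
declare @json nvarchar(max) =
    (select Id, FirstName, LastName from dbo.SomeTable for json path)
select j.Id, kv.[key] as ColumnName, kv.[value] as ColumnValue
from openjson(@json) with (
    Id  int           '$.Id',
    Row nvarchar(max) '$' as json
) j
outer apply openjson(j.Row) kv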
A few notable things:
- Uses the insert-exec feature so we can avoid the challenges of temp tables and dealing with dynamic columns in the sproc
The sproc:
create or alter proc test.table_compare (
@table1 varchar(max), --> first table to compare; use schema and brackets if needed
@table2 varchar(max), --> second table to compare; use schema and brackets if needed
@keycol varchar(max), --> join key (no brackets, expected same name in both tables)
@ignore varchar(max) = null --> optional columns to ignore (comma-separated, no brackets)
) as
/*
Created by Keith Townsend, 5/16/2024
Inspired by work done by Thato Mantai found at https://thitos.blogspot.com/2014/03/compare-data-from-two-tables.html
Published as answer on StackOverflow: https://stackoverflow.com/a/78490999/18155330
Compares all data between two tables for all columns that exist in both tables.
The key column is assumed to be named the same in both tables. It would be easy to change this to specify both key columns if that ever becomes necessary.
The casing of the key column can be different in each table as long as it's spelled the same. We do a lookup to get the right casing for each table.
Result sets:
1. counts of null vs. good keys, in case your data has null keys that you didn't know about
2. keys found in only one of the tables
3. summary of affected fields with counts
4. distinct differences with counts
5. all values that do not match
FUTURE: we could return all of these result sets as a single JSON result so the caller can receive and query it
*/
--> we need this unquoted for proper comparison to sys.columns
set @keycol = parsename(@keycol, 1)
--> we need case-sensitive key column name for the openjson statement in @sql, so a quick lookup will get the right casing for each table
declare @keycol1 varchar(max) = (select name from sys.columns where object_id = object_id(@table1) and name = @keycol)
declare @keycol2 varchar(max) = (select name from sys.columns where object_id = object_id(@table2) and name = @keycol)
--> get columns that are the same between the two tables; ignore any that aren't in both tables
declare @columns varchar(max) = (
select name = string_agg(quotename(name), ', ') within group (order by name)
from ( select a.name
from sys.columns a
inner join sys.columns b
on a.object_id = object_id(@table1)
and b.object_id = object_id(@table2)
and a.name = b.name
left join (select trim(value) from string_split(@ignore, ',')) ex(name)
on ex.name = a.name
where ex.name is null --> filter excluded columns
) x
)
--> json manipulation is an easy way to unpivot all the fields & values for all the rows
declare @sql varchar(max) = '
declare @json varchar(max) = (select ' + @columns + ' from {{tablename}} for json path)
select [{{keycol}}], [key], [value], [type], [table] = ''{{tablename}}''
from openjson(@json) with (
[{{keycol}}] varchar(max),
Row nvarchar(max) ''$'' as JSON
)
outer apply openjson(Row)
'
--> table to capture results of execute()
declare @values table (
[key] varchar(max),
[column] varchar(max),
[value] varchar(max),
[type] int,
[table] varchar(max)
)
--> get data from first table
declare @sql1 varchar(max) = replace(replace(@sql, '{{tablename}}', @table1), '{{keycol}}', @keycol1)
insert @values execute(@sql1)
--> get data from second table
declare @sql2 varchar(max) = replace(replace(@sql, '{{tablename}}', @table2), '{{keycol}}', @keycol2)
insert @values execute(@sql2)
--> show counts of null vs. good keys for each table
select [table],
[null key] = count(case when [key] is null and [column] = @keycol then 1 end),
[good key] = count(case when [key] is not null and [column] = @keycol then 1 end)
from @values
group by [table]
--> show keys of unmatched records for each table
select x.[table],
[unmatched] = count(*),
[keys not found in other table] = string_agg(x.[key], ', ') within group (order by x.[key])
from (select distinct [table], [key] from @values where [table] = @table1) tk1
full outer join (select distinct [table], [key] from @values where [table] = @table2) tk2
on tk1.[key] = tk2.[key]
outer apply (select
coalesce(tk1.[table], tk2.[table]),
coalesce(tk1.[key], tk2.[key]),
case when tk1.[table] is null or tk2.[table] is null then 1 else 0 end
) x([table],[key], [unmatched])
where 1 = x.unmatched
group by x.[table]
--> capture deltas
drop table if exists #deltas
select [Key] = coalesce(t1.[key], t2.[key]),
[Column] = coalesce(t1.[column], t2.[column]),
Table1 = t1.Value,
Table2 = t2.Value
into #deltas
from (select * from @values where [table] = @table1) t1
full outer join (select * from @values where [table] = @table2) t2
on t1.[key] = t2.[key]
and t1.[column] = t2.[column]
where t1.Value != t2.Value
--> show what fields are affected with counts
select [Column], [Differences] = count(*) from #deltas group by [Column] order by 2 desc
--> show distinct differences with counts
select [Column], Table1, Table2, Records = count(*) from #deltas group by [Column], Table1, Table2 order by 4 desc
--> show all differences
declare @keytype varchar(20) = (select type_name(system_type_id) from sys.columns where object_id = object_id(@table1) and name = @keycol)
select * from #deltas order by
--> each sort has to be a single type, so to conditionally cast you have to have separate sorts and return null for disabled sorts
case when @keytype = 'int' then cast([key] as int) end, --> if key type is int, order by int cast
case when @keytype = 'bigint' then cast([key] as bigint) end, --> if key type is bigint, order by bigint cast
case when @keytype not in ('int', 'bigint') then [key] end, --> if key type is not int, order without cast
[column]
GO
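A call might look like this (the table and key-column names here are only illustrative):
exec test.table_compare
    @table1 = 'dbo.RealData',
    @table2 = 'dbo.TempTestData',
    @keycol = 'PersonId',
    @ignore = 'ModifiedDate'  --> optional; omit to compare every shared column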
Upvotes: 1
Reputation: 63
SQL UNION and EXCEPT operations can be used to compare entire datasets between source and target tables for data completeness and correctness. For example:
(SELECT * FROM source_table)
UNION
(SELECT * FROM target_table)
EXCEPT
(SELECT * FROM source_table INTERSECT SELECT * FROM target_table);
You can also try UNION ALL if you want the actual number of rows.
Upvotes: 0
Reputation: 17
I can't add comments because my reputation isn't high enough. That seems silly.
Replying to @Juan Velez in the top-voted answer, which gives the solution
( SELECT * FROM table1
EXCEPT
SELECT * FROM table2)
UNION ALL
( SELECT * FROM table2
EXCEPT
SELECT * FROM table1)
and Juan asked, "How do I see which tables had which rows?"
This part of the query selects all rows in table 1 that are not in table 2
( SELECT * FROM @table1
EXCEPT
SELECT * FROM @table2)
This part of the query selects all rows in table 2 that are not in table 1
( SELECT * FROM @table2
EXCEPT
SELECT * FROM @table1)
UNION ALL just puts them together so you can see, in a single result set, whether either table is missing rows that the other has. If you want to see the individual differences, try
INSERT INTO @table1NotTable2
SELECT * FROM @table1
EXCEPT
SELECT * FROM @table2
INSERT INTO @table2NotTable1
SELECT * FROM @table2
EXCEPT
SELECT * FROM @table1
And to achieve the same final result as his answer, use
SELECT * FROM @table1NotTable2
UNION ALL
SELECT * FROM @table2NotTable1
For people wanting this as a copy/paste template:
DECLARE @table1 TABLE (
)
DECLARE @table1NotTable2 TABLE (
)
DECLARE @table2 TABLE (
)
DECLARE @table2NotTable1 TABLE (
)
INSERT INTO @table1 (
)
INSERT INTO @table2 (
)
INSERT INTO @table1NotTable2
SELECT * FROM @table1
EXCEPT
SELECT * FROM @table2
INSERT INTO @table2NotTable1
SELECT * FROM @table2
EXCEPT
SELECT * FROM @table1
SELECT * FROM @table1NotTable2
SELECT * FROM @table2NotTable1
SELECT * FROM @table1NotTable2 UNION ALL SELECT * FROM @table2NotTable1
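Filled in with the question's three columns and a couple of sample rows (values borrowed from another answer below), the template might look like this:
DECLARE @table1 TABLE ([First Name] NVARCHAR(100), [Last Name] NVARCHAR(100), [Product Name] NVARCHAR(100))
DECLARE @table1NotTable2 TABLE ([First Name] NVARCHAR(100), [Last Name] NVARCHAR(100), [Product Name] NVARCHAR(100))
DECLARE @table2 TABLE ([First Name] NVARCHAR(100), [Last Name] NVARCHAR(100), [Product Name] NVARCHAR(100))
DECLARE @table2NotTable1 TABLE ([First Name] NVARCHAR(100), [Last Name] NVARCHAR(100), [Product Name] NVARCHAR(100))
INSERT INTO @table1 VALUES ('Al', 'Bundy', 'Ladies Size 12'), ('Peggy', 'Bundy', 'TV Guide')
INSERT INTO @table2 VALUES ('Al', 'Bundy', 'Ladies Size 13'), ('Peggy', 'Bundy', 'TV Guide')
INSERT INTO @table1NotTable2
SELECT * FROM @table1
EXCEPT
SELECT * FROM @table2
INSERT INTO @table2NotTable1
SELECT * FROM @table2
EXCEPT
SELECT * FROM @table1
SELECT * FROM @table1NotTable2 UNION ALL SELECT * FROM @table2NotTable1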
Upvotes: 0
Reputation: 43
This ought to work just fine: a SELECT query to find the difference between the row counts of two tables.
select (t1 - t2) as count_diff from (select (select count(*) from tbl1) t1, (select count(*) from tbl2) t2) as counts
Upvotes: 0
Reputation: 764
EXCEPT and NOT EXISTS are good ways to quickly find differences between sets of data, but I often want to know exactly which columns are different and how they are different.
What I find useful in these situations is using UNPIVOT to convert columns into key-value pairs that one can compare row for row.
In order to be useful though, there needs to be a way to match the records you want to compare, in your example something like a social security number or a "person id":
CREATE TABLE [RealData]
(
[PersonId] INT,
[FirstName] NVARCHAR(100),
[LastName] NVARCHAR(100),
[ProductName] NVARCHAR(100)
)
CREATE TABLE [TempTestData]
(
[PersonId] INT,
[FirstName] NVARCHAR(100),
[LastName] NVARCHAR(100),
[ProductName] NVARCHAR(100)
)
INSERT INTO [RealData] ([PersonId], [FirstName], [LastName], [ProductName])
VALUES
(1, 'Al', 'Bundy', 'Ladies Size 12'),
(2, 'Peggy', 'Bundy', 'TV Guide')
INSERT INTO [TempTestData] ([PersonId], [FirstName], [LastName], [ProductName])
VALUES
(1, 'Al', 'Bundy', 'Ladies Size 13'),
(2, 'Peggy', 'Bundy', 'TV Guide')
And the UNPIVOT with a couple of CTEs thrown in:
;WITH RealDataCte AS (
SELECT
'Real Data' AS [DataSource],
unpivotedRealData.*
FROM
(SELECT
CAST([PersonId] AS NVARCHAR(100)) AS [PersonId],
[FirstName],
[LastName],
[ProductName]
FROM [RealData]) AS realData
UNPIVOT
(ColumnValue FOR ColumnName IN ([FirstName], [LastName], [ProductName])) AS unpivotedRealData
),
TempTestDataCte AS (
SELECT
'Temp Test Data' AS [DataSource],
unpivotedTempTestData.*
FROM
(SELECT
CAST([PersonId] AS NVARCHAR(100)) AS [PersonId],
[FirstName],
[LastName],
[ProductName]
FROM [TempTestData]) AS tempTestData
UNPIVOT
(ColumnValue FOR ColumnName IN ([FirstName], [LastName], [ProductName])) AS unpivotedTempTestData
)
SELECT
RealDataCte.[DataSource],
RealDataCte.[ColumnName],
RealDataCte.[ColumnValue],
TempTestDataCte.[DataSource],
TempTestDataCte.[ColumnName],
TempTestDataCte.[ColumnValue],
CASE WHEN RealDataCte.[ColumnValue] <> TempTestDataCte.[ColumnValue] THEN 'YES' ELSE 'NO' END AS ColumnsDiffer
FROM RealDataCte
INNER JOIN
TempTestDataCte
ON RealDataCte.[ColumnName] = TempTestDataCte.[ColumnName]
AND RealDataCte.[PersonId] = TempTestDataCte.[PersonId]
WHERE
RealDataCte.[ColumnValue] <> TempTestDataCte.[ColumnValue]
The result: the two products are different.
This is great if you have many, many columns to compare over a large number of rows.
It can take a while to set up, though, and you do need to convert every column value to the same type, which can take a while (notice the CAST for PersonId).
If the dataset is really large, you may also want to use temp tables instead of CTEs.
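A minimal sketch of that swap, materializing the unpivoted rows into temp tables (the temp table names are made up):
SELECT [PersonId], ColumnName, ColumnValue
INTO #UnpivotedRealData
FROM (SELECT
          CAST([PersonId] AS NVARCHAR(100)) AS [PersonId],
          [FirstName],
          [LastName],
          [ProductName]
      FROM [RealData]) AS realData
UNPIVOT (ColumnValue FOR ColumnName IN ([FirstName], [LastName], [ProductName])) AS unpivotedRealData
-- ...repeat for [TempTestData] into #UnpivotedTempTestData, then join the two temp
-- tables on PersonId and ColumnName exactly as the CTEs are joined above.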
Upvotes: 1
Reputation: 1372
There can be performance issues with both the left join and the full join on large data sets.
In my opinion this is the best solution:
select [First Name], count(1) as e from
(select * from [Temp Test Data]
union all
select * from [Temp Test Data 2]) a
group by [First Name] having count(1) = 1
Upvotes: 1
Reputation: 2094
You can use EXCEPT, for example something like this:
-- Rows that DB1..Tb1 has but DB2..Tb1 does not
Select Col1,Col2,Col3 From DB1..Tb1
except
Select Col1,Col2,Col3 From DB2..Tb1
-- Now we change the order
-- Rows that DB2..Tb1 has but DB1..Tb1 does not
Select Col1,Col2,Col3 From DB2..Tb1
except
Select Col1,Col2,Col3 From DB1..Tb1
Upvotes: 4
Reputation: 5654
Presenting the Cadillac of Diffs as an SP. See within for the basic template, which was based on the answer by @erikkallen. It supports:
exec Common.usp_DiffTableRows '#t1', '#t2';
exec Common.usp_DiffTableRows
@pTable0 = 'ydb.ysh.table1',
@pTable1 = 'xdb.xsh.table2',
@pOrderByCsvOpt = null, -- Order the results
@pOnlyCsvOpt = null, -- Only compare these columns
@pIgnoreCsvOpt = null; -- Ignore these columns (ignored if @pOnlyCsvOpt is specified)
alter proc [Common].[usp_DiffTableRows]
@pTable0 varchar(300),
@pTable1 varchar(300),
@pOrderByCsvOpt nvarchar(1000) = null, -- Order the Results
@pOnlyCsvOpt nvarchar(4000) = null, -- Only compare these columns
@pIgnoreCsvOpt nvarchar(4000) = null, -- Ignore these columns (ignored if @pOnlyCsvOpt is specified)
@pDebug bit = 0
as
/*---------------------------------------------------------------------------------------------------------------------
Purpose: Compare rows between two tables.
Usage: exec Common.usp_DiffTableRows '#a', '#b';
Modified By Description
---------- ---------- -------------------------------------------------------------------------------------------
2015.10.06 crokusek Initial Version
2019.03.13 crokusek Added @pOrderByCsvOpt
2019.06.26 crokusek Support for @pIgnoreCsvOpt, @pOnlyCsvOpt.
2019.09.04 crokusek Minor debugging improvement
2020.03.12 crokusek Detect duplicate rows in either source table
---------------------------------------------------------------------------------------------------------------------*/
begin try
if (substring(@pTable0, 1, 1) = '#')
set @pTable0 = 'tempdb..' + @pTable0; -- object_id test below needs full names for temp tables
if (substring(@pTable1, 1, 1) = '#')
set @pTable1 = 'tempdb..' + @pTable1; -- object_id test below needs full names for temp tables
if (object_id(@pTable0) is null)
raiserror('Table name is not recognized: ''%s''', 16, 1, @pTable0);
if (object_id(@pTable1) is null)
raiserror('Table name is not recognized: ''%s''', 16, 1, @pTable1);
create table #ColumnGathering
(
Name nvarchar(300) not null,
Sequence int not null,
TableArg tinyint not null
);
declare
@usp varchar(100) = object_name(@@procid),
@sql nvarchar(4000),
@sqlTemplate nvarchar(4000) =
'
use $database$;
insert into #ColumnGathering
select Name, column_id as Sequence, $TableArg$ as TableArg
from sys.columns c
where object_id = object_id(''$table$'', ''U'')
';
set @sql = replace(replace(replace(@sqlTemplate,
'$TableArg$', 0),
'$database$', (select DatabaseName from Common.ufn_SplitDbIdentifier(@pTable0))),
'$table$', @pTable0);
if (@pDebug = 1)
print 'Sql #CG 0: ' + @sql;
exec sp_executesql @sql;
set @sql = replace(replace(replace(@sqlTemplate,
'$TableArg$', 1),
'$database$', (select DatabaseName from Common.ufn_SplitDbIdentifier(@pTable1))),
'$table$', @pTable1);
if (@pDebug = 1)
print 'Sql #CG 1: ' + @sql;
exec sp_executesql @sql;
if (@pDebug = 1)
select * from #ColumnGathering;
select Name,
min(Sequence) as Sequence,
convert(bit, iif(min(TableArg) = 0, 1, 0)) as InTable0,
convert(bit, iif(max(TableArg) = 1, 1, 0)) as InTable1
into #Columns
from #ColumnGathering
group by Name
having ( @pOnlyCsvOpt is not null
and Name in (select Value from Common.ufn_UsvToNVarcharKeyTable(@pOnlyCsvOpt, default)))
or
( @pOnlyCsvOpt is null
and @pIgnoreCsvOpt is not null
and Name not in (select Value from Common.ufn_UsvToNVarcharKeyTable(@pIgnoreCsvOpt, default)))
or
( @pOnlyCsvOpt is null
and @pIgnoreCsvOpt is null)
if (exists (select 1 from #Columns where InTable0 = 0 or InTable1 = 0))
begin
select 1; -- without this the debugging info doesn't stream sometimes
select * from #Columns order by Sequence;
waitfor delay '00:00:02'; -- give results chance to stream before raising exception
raiserror('Columns are not equal between tables, consider using args @pIgnoreCsvOpt, @pOnlyCsvOpt. See Result Sets for details.', 16, 1);
end
if (@pDebug = 1)
select * from #Columns order by Sequence;
declare
@columns nvarchar(4000) = --iif(@pOnlyCsvOpt is null and @pIgnoreCsvOpt is null,
-- '*',
(
select substring((select ',' + ac.name
from #Columns ac
order by Sequence
for xml path('')),2,200000) as csv
);
if (@pDebug = 1)
begin
print 'Columns: ' + @columns;
waitfor delay '00:00:02'; -- give results chance to stream before possibly raising exception
end
-- Based on https://stackoverflow.com/a/2077929/538763
-- - Added sensing for duplicate rows
-- - Added reporting of source table location
--
set @sqlTemplate = '
with
a as (select ~, Row_Number() over (partition by ~ order by (select null)) -1 as Duplicates from $a$),
b as (select ~, Row_Number() over (partition by ~ order by (select null)) -1 as Duplicates from $b$)
select 0 as SourceTable, ~
from
(
select * from a
except
select * from b
) anb
union all
select 1 as SourceTable, ~
from
(
select * from b
except
select * from a
) bna
order by $orderBy$
';
set @sql = replace(replace(replace(replace(@sqlTemplate,
'$a$', @pTable0),
'$b$', @pTable1),
'~', @columns),
'$orderBy$', coalesce(@pOrderByCsvOpt, @columns + ', SourceTable')
);
if (@pDebug = 1)
print 'Sql: ' + @sql;
exec sp_executesql @sql;
end try
begin catch
declare
@CatchingUsp varchar(100) = object_name(@@procid);
if (xact_state() = -1)
rollback;
-- Disabled for S.O. post
--exec Common.usp_Log
--@pMethod = @CatchingUsp;
--exec Common.usp_RethrowError
--@pCatchingMethod = @CatchingUsp;
throw;
end catch
go
create function Common.Trim
(
@pOriginalString nvarchar(max),
@pCharsToTrim nvarchar(50) = null -- specify null or 'default' for whitespace
)
returns table
with schemabinding
as
/*--------------------------------------------------------------------------------------------------
Purpose: Trim the specified characters from a string.
Modified By Description
---------- -------------- --------------------------------------------------------------------
2012.09.25 S.Rutszy/crok Modified from https://dba.stackexchange.com/a/133044/9415
--------------------------------------------------------------------------------------------------*/
return
with cte AS
(
select patindex(N'%[^' + EffCharsToTrim + N']%', @pOriginalString) AS [FirstChar],
patindex(N'%[^' + EffCharsToTrim + N']%', reverse(@pOriginalString)) AS [LastChar],
len(@pOriginalString + N'~') - 1 AS [ActualLength]
from
(
select EffCharsToTrim = coalesce(@pCharsToTrim, nchar(0x09) + nchar(0x20) + nchar(0x0d) + nchar(0x0a))
) c
)
select substring(@pOriginalString, [FirstChar],
((cte.[ActualLength] - [LastChar]) - [FirstChar] + 2)
) AS [TrimmedString]
--
--cte.[ActualLength],
--[FirstChar],
--((cte.[ActualLength] - [LastChar]) + 1) AS [LastChar]
from cte;
go
create function [Common].[ufn_UsvToNVarcharKeyTable] (
@pCsvList nvarchar(MAX),
@pSeparator nvarchar(1) = ',' -- can pass keyword 'default' when calling using ()'s
)
--
-- SQL Server 2012 distinguishes nvarchar keys up to maximum of 450 in length (900 bytes)
--
returns @tbl table (Value nvarchar(450) not null primary key(Value)) as
/*-------------------------------------------------------------------------------------------------
Purpose: Converts a comma separated list of strings into a sql NVarchar table. From
http://www.programmingado.net/a-398/SQL-Server-parsing-CSV-into-table.aspx
This may be called from RunSelectQuery:
GRANT SELECT ON Common.ufn_UsvToNVarcharTable TO MachCloudDynamicSql;
Modified By Description
---------- -------------- -------------------------------------------------------------------
2011.07.13 internet Initial version
2011.11.22 crokusek Support nvarchar strings and a custom separator.
2017.12.06 crokusek Trim leading and trailing whitespace from each element.
2019.01.26 crokusek Remove newlines
-------------------------------------------------------------------------------------------------*/
begin
declare
@pos int,
@textpos int,
@chunklen smallint,
@str nvarchar(4000),
@tmpstr nvarchar(4000),
@leftover nvarchar(4000),
@csvList nvarchar(max) = iif(@pSeparator not in (char(13), char(10), char(13) + char(10)),
replace(replace(@pCsvList, char(13), ''), char(10), ''),
@pCsvList); -- remove newlines
set @textpos = 1
set @leftover = ''
while @textpos <= len(@csvList)
begin
set @chunklen = 4000 - len(@leftover)
set @tmpstr = ltrim(@leftover + substring(@csvList, @textpos, @chunklen))
set @textpos = @textpos + @chunklen
set @pos = charindex(@pSeparator, @tmpstr)
while @pos > 0
begin
set @str = substring(@tmpstr, 1, @pos - 1)
set @str = (select TrimmedString from Common.Trim(@str, default));
insert @tbl (value) values(@str);
set @tmpstr = ltrim(substring(@tmpstr, @pos + 1, len(@tmpstr)))
set @pos = charindex(@pSeparator, @tmpstr)
end
set @leftover = @tmpstr
end
-- Handle @leftover
set @str = (select TrimmedString from Common.Trim(@leftover, default));
if @str <> ''
insert @tbl (value) values(@str);
return
end
GO
create function Common.ufn_SplitDbIdentifier(@pIdentifier nvarchar(300))
returns @table table
(
InstanceName nvarchar(300) not null,
DatabaseName nvarchar(300) not null,
SchemaName nvarchar(300),
BaseName nvarchar(300) not null,
FullTempDbBaseName nvarchar(300), -- non-null for tempdb (e.g. #Abc____...)
InstanceWasSpecified bit not null,
DatabaseWasSpecified bit not null,
SchemaWasSpecified bit not null,
IsCurrentInstance bit not null,
IsCurrentDatabase bit not null,
IsTempDb bit not null,
OrgIdentifier nvarchar(300) not null
) as
/*-----------------------------------------------------------------------------------------------------------
Purpose: Split a Sql Server Identifier into its parts, providing appropriate default values and
handling temp table (tempdb) references.
Example: select * from Common.ufn_SplitDbIdentifier('t')
union all
select * from Common.ufn_SplitDbIdentifier('s.t')
union all
select * from Common.ufn_SplitDbIdentifier('d.s.t')
union all
select * from Common.ufn_SplitDbIdentifier('i.d.s.t')
union all
select * from Common.ufn_SplitDbIdentifier('#d')
union all
select * from Common.ufn_SplitDbIdentifier('tempdb..#d');
-- Empty
select * from Common.ufn_SplitDbIdentifier('illegal name');
Modified By Description
---------- -------------- -----------------------------------------------------------------------------
2013.09.27 crokusek Initial version.
-----------------------------------------------------------------------------------------------------------*/
begin
declare
@name nvarchar(300) = ltrim(rtrim(@pIdentifier));
-- Return an empty table as a "throw"
--
--Removed for SO post
--if (Common.ufn_IsSpacelessLiteralIdentifier(@name) = 0)
-- return;
-- Find dots starting from the right by reversing first.
declare
@revName nvarchar(300) = reverse(@name);
declare
@firstDot int = charindex('.', @revName);
declare
@secondDot int = iif(@firstDot = 0, 0, charindex('.', @revName, @firstDot + 1));
declare
@thirdDot int = iif(@secondDot = 0, 0, charindex('.', @revName, @secondDot + 1));
declare
@fourthDot int = iif(@thirdDot = 0, 0, charindex('.', @revName, @thirdDot + 1));
--select @firstDot, @secondDot, @thirdDot, @fourthDot, len(@name);
-- Undo the reverse() (first dot is first from the right).
--
set @firstDot = iif(@firstDot = 0, 0, len(@name) - @firstDot + 1);
set @secondDot = iif(@secondDot = 0, 0, len(@name) - @secondDot + 1);
set @thirdDot = iif(@thirdDot = 0, 0, len(@name) - @thirdDot + 1);
set @fourthDot = iif(@fourthDot = 0, 0, len(@name) - @fourthDot + 1);
--select @firstDot, @secondDot, @thirdDot, @fourthDot, len(@name);
declare
@baseName nvarchar(300) = substring(@name, @firstDot + 1, len(@name) - @firstdot);
declare
@schemaName nvarchar(300) = iif(@firstDot - @secondDot - 1 <= 0,
null,
substring(@name, @secondDot + 1, @firstDot - @secondDot - 1));
declare
@dbName nvarchar(300) = iif(@secondDot - @thirdDot - 1 <= 0,
null,
substring(@name, @thirdDot + 1, @secondDot - @thirdDot - 1));
declare
@instName nvarchar(300) = iif(@thirdDot - @fourthDot - 1 <= 0,
null,
substring(@name, @fourthDot + 1, @thirdDot - @fourthDot - 1));
with input as (
select
coalesce(@instName, '[' + @@servername + ']') as InstanceName,
coalesce(@dbName, iif(left(@baseName, 1) = '#', 'tempdb', db_name())) as DatabaseName,
coalesce(@schemaName, iif(left(@baseName, 1) = '#', 'dbo', schema_name())) as SchemaName,
@baseName as BaseName,
iif(left(@baseName, 1) = '#',
(
select [name] from tempdb.sys.objects
where object_id = object_id('tempdb..' + @baseName)
),
null) as FullTempDbBaseName,
iif(@instName is null, 0, 1) InstanceWasSpecified,
iif(@dbName is null, 0, 1) DatabaseWasSpecified,
iif(@schemaName is null, 0, 1) SchemaWasSpecified
)
insert into @table
select i.InstanceName, i.DatabaseName, i.SchemaName, i.BaseName, i.FullTempDbBaseName,
i.InstanceWasSpecified, i.DatabaseWasSpecified, i.SchemaWasSpecified,
iif(i.InstanceName = '[' + @@servername + ']', 1, 0) as IsCurrentInstance,
iif(i.DatabaseName = db_name(), 1, 0) as IsCurrentDatabase,
iif(left(@baseName, 1) = '#', 1, 0) as IsTempDb,
@name as OrgIdentifier
from input i;
return;
end
GO
Upvotes: 5
Reputation: 14920
Simple variation on @erikkallen's answer that shows which table the row is present in:
( SELECT 'table1' as source, * FROM table1
EXCEPT
SELECT * FROM table2)
UNION ALL
( SELECT 'table2' as source, * FROM table2
EXCEPT
SELECT * FROM table1)
If you get the error
All queries combined using a UNION, INTERSECT or EXCEPT operator must have an equal number of expressions in their target lists.
then it may help to add the source literal to both sides of each EXCEPT:
( SELECT 'table1' as source, * FROM table1
EXCEPT
SELECT 'table1' as source, * FROM table2)
UNION ALL
( SELECT 'table2' as source, * FROM table2
EXCEPT
SELECT 'table2' as source, * FROM table1)
Upvotes: 18
Reputation: 2175
To get all the differences between two tables, you can use this SQL query:
SELECT 'TABLE1-ONLY' AS SRC, T1.*
FROM (
SELECT * FROM Table1
EXCEPT
SELECT * FROM Table2
) AS T1
UNION ALL
SELECT 'TABLE2-ONLY' AS SRC, T2.*
FROM (
SELECT * FROM Table2
EXCEPT
SELECT * FROM Table1
) AS T2
;
Upvotes: 24
Reputation: 29
This will do the trick; it is similar to Tiago's solution but returns the "source" table as well.
select [First Name], [Last Name], max(_tabloc) as _tabloc
from (
select [First Name], [Last Name], 't1' as _tabloc from table1
union all
select [First Name], [Last Name], 't2' as _tabloc from table2
) v
group by [First Name], [Last Name]
having count(1)=1
The result will contain the differences between the tables; the _tabloc column tells you which table each row came from.
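To cover the Product column from the question as well, the same pattern just needs the extra column in the select list, the union and the group by:
select [First Name], [Last Name], [Product Name], max(_tabloc) as _tabloc
from (
select [First Name], [Last Name], [Product Name], 't1' as _tabloc from table1
union all
select [First Name], [Last Name], [Product Name], 't2' as _tabloc from table2
) v
group by [First Name], [Last Name], [Product Name]
having count(1)=1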
Upvotes: 2
Reputation: 59
For a simple smoke test where you're trying to ensure two tables match without worrying about column names:
--ensure tables have matching records
Select count (*) from tbl_A
Select count (*) from tbl_B
--create temp table of all records in both tables
Select * into #demo from tbl_A
Union All
Select * from tbl_B
--Distinct #demo records = Total #demo records/2 = Total tbl_A records = total tbl_B records
Select distinct * from #demo
You can easily write a stored procedure to compare a batch of tables.
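A rough sketch of what such a procedure could look like (the procedure name is made up, it builds dynamic SQL, and both tables are assumed to have union-compatible columns):
CREATE OR ALTER PROCEDURE dbo.usp_SmokeTestCompare
    @tableA sysname,
    @tableB sysname
AS
BEGIN
    -- Mirrors the manual steps above: row counts per table, plus the total and
    -- distinct counts of the combined rows. Assumes matching column layouts.
    DECLARE @sql nvarchar(max) = N'
        SELECT
            (SELECT COUNT(*) FROM ' + QUOTENAME(@tableA) + N') AS rows_a,
            (SELECT COUNT(*) FROM ' + QUOTENAME(@tableB) + N') AS rows_b,
            (SELECT COUNT(*) FROM (SELECT * FROM ' + QUOTENAME(@tableA) + N'
                                   UNION ALL
                                   SELECT * FROM ' + QUOTENAME(@tableB) + N') u) AS total_rows,
            (SELECT COUNT(*) FROM (SELECT * FROM ' + QUOTENAME(@tableA) + N'
                                   UNION
                                   SELECT * FROM ' + QUOTENAME(@tableB) + N') d) AS distinct_rows;';
    EXEC sys.sp_executesql @sql;
END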
Upvotes: 1
Reputation: 117561
If you want to know which column values are different, you could use the Entity-Attribute-Value model:
declare @Data1 xml, @Data2 xml
select @Data1 =
(
select *
from (select * from Test1 except select * from Test2) as a
for xml raw('Data')
)
select @Data2 =
(
select *
from (select * from Test2 except select * from Test1) as a
for xml raw('Data')
)
;with CTE1 as (
select
T.C.value('../@ID', 'bigint') as ID,
T.C.value('local-name(.)', 'nvarchar(128)') as Name,
T.C.value('.', 'nvarchar(max)') as Value
from @Data1.nodes('Data/@*') as T(C)
), CTE2 as (
select
T.C.value('../@ID', 'bigint') as ID,
T.C.value('local-name(.)', 'nvarchar(128)') as Name,
T.C.value('.', 'nvarchar(max)') as Value
from @Data2.nodes('Data/@*') as T(C)
)
select
isnull(C1.ID, C2.ID) as ID, isnull(C1.Name, C2.Name) as Name, C1.Value as Value1, C2.Value as Value2
from CTE1 as C1
full outer join CTE2 as C2 on C2.ID = C1.ID and C2.Name = C1.Name
where
not
(
C1.Value is null and C2.Value is null or
C1.Value is not null and C2.Value is not null and C1.Value = C2.Value
)
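The snippet above assumes both tables share an ID column that identifies matching rows. A minimal setup to try it might be (table names match the snippet; the data is made up):
CREATE TABLE Test1 (ID bigint, FirstName nvarchar(100), Product nvarchar(100));
CREATE TABLE Test2 (ID bigint, FirstName nvarchar(100), Product nvarchar(100));
INSERT INTO Test1 VALUES (1, 'Al', 'Ladies Size 12'), (2, 'Peggy', 'TV Guide');
INSERT INTO Test2 VALUES (1, 'Al', 'Ladies Size 13'), (2, 'Peggy', 'TV Guide');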
Upvotes: 8
Reputation: 1026
I know this may not be a popular answer, but I agree with @Randy Minder on using a third-party tool when a more complex comparison is needed.
This specific case is easy, and such tools are not needed for it, but things can get complex quickly if you introduce more columns, databases on two servers, more complex comparison criteria and such.
There are a lot of these tools, such as ApexSQL Data Diff or Quest Toad, and you can always use them in trial mode to get the job done.
Upvotes: 39
Reputation: 34421
( SELECT * FROM table1
EXCEPT
SELECT * FROM table2)
UNION ALL
( SELECT * FROM table2
EXCEPT
SELECT * FROM table1)
Upvotes: 347
Reputation: 1720
Try this :
SELECT
[First Name], [Last Name]
FROM
[Temp Test Data] AS td EXCEPTION JOIN [Data] AS d ON
(d.[First Name] = td.[First Name] OR d.[Last Name] = td.[Last Name])
Much simpler to read.
Upvotes: 2
Reputation: 5944
If you have tables A and B, both with column C, here are the records which are present in table A but not in B:
SELECT A.*
FROM A
LEFT JOIN B ON (A.C = B.C)
WHERE B.C IS NULL
To get all the differences with a single query, a full join must be used, like this:
SELECT A.*, B.*
FROM A
FULL JOIN B ON (A.C = B.C)
WHERE A.C IS NULL OR B.C IS NULL
What you need to know in this case is that when a record can be found in A but not in B, the columns which come from B will be NULL, and similarly for those which are present in B and not in A, the columns from A will be NULL.
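If you also want the output to say which side each row comes from, a CASE over those NULL columns does it (a sketch built on the same full join):
SELECT COALESCE(A.C, B.C) AS C,
       CASE WHEN B.C IS NULL THEN 'only in A'
            WHEN A.C IS NULL THEN 'only in B'
       END AS FoundIn
FROM A
FULL JOIN B ON (A.C = B.C)
WHERE A.C IS NULL OR B.C IS NULL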
Upvotes: 291