Reputation: 47
I'm using SQL Server 2008 (v10.0 SP3) as my database.
I am trying to find how to compare two nearly identical tables structure wise and only return the column name and value of columns that do not match.
I have two tables.
Table A has about 260 columns and a unique identifier for each record. It comes from a view on another server.
Table B is a copy of Table A in structure with the addition of an insert date column and an action column.
IF OBJECT_ID('tempdb..#TableA') IS NOT NULL
DROP TABLE #TableA
IF OBJECT_ID('tempdb..#TableB') IS NOT NULL
DROP TABLE #TableB
CREATE TABLE #TableA (
UniqueID INT,[Name] CHAR(3),[Address] CHAR(15),
HairColor CHAR(6),ImportDate DATETIME
)
CREATE TABLE #TableB (
UniqueID INT,[Name] CHAR(3),[Address] CHAR(15),
HairColor CHAR(6),ImportDate DATETIME,
AuditDate DATETIME,[Action] CHAR(10)
)
INSERT INTO #TableA
VALUES (1,'Joe','1 Main St.','Brown','12/1/2013')
INSERT INTO #TableA
VALUES (2,'Jen','1 Main St.','Red','12/1/2013')
INSERT INTO #TableB
VALUES (2,'Jen','1 Main St.','Blonde','10/1/2013','12/1/2013','CHANGE')
INSERT INTO #TableB
VALUES (2,'Jen','1 Baker St.','Blonde','4/1/2010','10/1/2013','CHANGE')
INSERT INTO #TableB
VALUES (2,'Jen','4 Deer Ave.','Black','6/1/2004','4/1/2010','CHANGE')
SELECT * FROM #TableA AS ta
SELECT * FROM #TableB AS tb
Table A
╔══════════╦══════╦════════════╦═══════════╦════════════╗
║ UniqueID ║ Name ║ Address ║ HairColor ║ ImportDate ║
╠══════════╬══════╬════════════╬═══════════╬════════════╣
║ 1 ║ Joe ║ 1 Main St. ║ Brown ║ 12/1/2013 ║
║ 2 ║ Jen ║ 1 Main St. ║ Red ║ 12/1/2013 ║
╚══════════╩══════╩════════════╩═══════════╩════════════╝
Table B
╔══════════╦══════╦═════════════╦═══════════╦════════════╦═══════════╦════════╗
║ UniqueID ║ Name ║ Address ║ HairColor ║ ImportDate ║ AuditDate ║ Action ║
╠══════════╬══════╬═════════════╬═══════════╬════════════╬═══════════╬════════╣
║ 2 ║ Jen ║ 1 Main St. ║ Blonde ║ 10/1/2013 ║ 12/1/2013 ║ CHANGE ║
║ 2 ║ Jen ║ 1 Baker St. ║ Blonde ║ 4/1/2010 ║ 10/1/2013 ║ CHANGE ║
║ 2 ║ Jen ║ 4 Deer Ave. ║ Black ║ 6/1/2004 ║ 4/1/2010 ║ CHANGE ║
╚══════════╩══════╩═════════════╩═══════════╩════════════╩═══════════╩════════╝
Table A is truncated and reloaded each month but before that is done, a comparison between the new data coming in to Table A (new A) and the data already in Table A (old A) takes place.
If new A does not contain a row that is in old A I insert the old A row into Table B with the insertdate
and DELETED
as the Action.
If new A contains a row that is not in old A I insert the new A row into Table B with the insertdate
and ADDED
as the Action.
If new A contains a row that has any column that doesn't match with old A I insert the old A row into Table B with the insertdate
and CHANGE
as the Action.
As you can see, Table B can have multiple rows with the same unique identifier as Table A because the source data for the unique record in Table A could change from month to month.
Now additions and deletions are easy to report. What I am having difficulty with is reporting changes. With 260 common rows I'd like to be able to select only the columns in Table B that do not match the related columns in Table A, left joined on TableA.UniqueID = TableB.UniqueID
I was thinking about making a table valued function with using the input UniqueID
from Table A and outputting something like this:
╔══════════╦════════════╦═════════════╦════════════╦═══════════╦════════╗
║ UniqueID ║ ColumnName ║ Value ║ ImportDate ║ AuditDate ║ Action ║
╠══════════╬════════════╬═════════════╬════════════╬═══════════╬════════╣
║ 2 ║ HairColor ║ Blonde ║ 10/1/2013 ║ 12/1/2013 ║ CHANGE ║
║ 2 ║ Address ║ 1 Baker St. ║ 4/1/2010 ║ 10/1/2013 ║ CHANGE ║
║ 2 ║ HairColor ║ Blonde ║ 4/1/2010 ║ 10/1/2013 ║ CHANGE ║
║ 2 ║ Address ║ 4 Deer Ave. ║ 6/1/2004 ║ 4/1/2010 ║ CHANGE ║
║ 2 ║ HairColor ║ Black ║ 6/1/2004 ║ 4/1/2010 ║ CHANGE ║
╚══════════╩════════════╩═════════════╩════════════╩═══════════╩════════╝
I am just not sure how to do it.
This is where my brain stops working. Is it possible to use ORDINAL_POSITION in INFORMATION_SCHEMA.COLUMNS to my advantage?
SELECT a.ORDINAL_POSITION,a.COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS AS a
JOIN INFORMATION_SCHEMA.COLUMNS AS b ON a.COLUMN_NAME = b.COLUMN_NAME
WHERE a.TABLE_NAME = 'TableA' AND a.TABLE_SCHEMA='dbo'
AND b.TABLE_NAME = 'TableB' AND b.TABLE_SCHEMA='dbo'
Upvotes: 3
Views: 233
Reputation: 977
Sounds like UNPIVOT
is what you need. From MSDN: Using PIVOT and UNPIVOT
Here's a possible solution to your question:
SELECT UniqueID, ColumnName, Value, ImportDate, AuditDate, Action
FROM
(
SELECT
TableA.UniqueID,
CAST((CASE WHEN ISNULL(TableB.Address, '') <> ISNULL(TableA.Address, '')
THEN ISNULL(TableB.Address, '')
ELSE NULL END) AS nvarchar(255))
AS Address,
CAST((CASE WHEN ISNULL(TableB.HairColor, '') <> ISNULL(TableA.HairColor, '')
THEN ISNULL(TableB.HairColor, '')
ELSE NULL END) AS nvarchar(255))
AS HairColor,
TableB.ImportDate,
TableB.AuditDate,
TableB.Action
FROM TableA INNER JOIN TableB ON TableB.UniqueID = TableA.UniqueID
) AS p
UNPIVOT
(
Value FOR ColumnName in (Address, HairColor)
) AS up
WHERE Value IS NOT NULL
Note that I cast the values to nvarchar(255)
in order to resolve conflicts between types. You may need adjust that to something more appropriate.
Also, at a glance, I do think the generated results may need some refining based around the import and audit dates. But that's another topic. Also, I ended up using an INNER JOIN
since a LEFT JOIN
didn't make sense to me in this case. Maybe I'm missing something.
So, all that said, this should get the results that you're looking for.
Upvotes: 0
Reputation: 20975
If you have Visual Studio Premium or Ultimate, you can use the Schema Compare Database Project for this purpose. It's pretty sophisticated and even lets you generate DDL for differences in source and/or target. You can compare data also if needed. I've used it in many projects in the past.
MSDN: Compare and Synchronize Database Schemas.
Upvotes: 0
Reputation: 467
You could try this for every column...
SELECT a.UniqueID,'HairColour' [ColumnName],b.HairColour [Value],...,'CHANGE' [Action] FROM Tableb b
INNER JOIN Tablea a ON
a.UniqueID = b.uniqueID
WHERE a.HairColor <> b.HairColor
UNION
SELECT a.UniqueID,'Address' [ColumnName],b.Address [Value],...,'CHANGE' [Action] FROM Tableb b
INNER JOIN Tablea a ON
a.UniqueID = b.uniqueID
WHERE a.Address <> b.Address
etc...
Upvotes: 0