Reputation: 47

How do I compare two structurally similar tables and only return the column name and value of differing values in SQL Server?

I'm using SQL Server 2008 (v10.0 SP3) as my database.

I am trying to find how to compare two nearly identical tables structure wise and only return the column name and value of columns that do not match.

I have two tables.

Table A has about 260 columns and a unique identifier for each record. It comes from a view on another server.

Table B is a copy of Table A in structure with the addition of an insert date column and an action column.

IF OBJECT_ID('tempdb..#TableA') IS NOT NULL
    DROP TABLE #TableA
IF OBJECT_ID('tempdb..#TableB') IS NOT NULL
    DROP TABLE #TableB

CREATE TABLE #TableA (
UniqueID INT,[Name] CHAR(3),[Address] CHAR(15),
HairColor CHAR(6),ImportDate DATETIME
)

CREATE TABLE #TableB (
UniqueID INT,[Name] CHAR(3),[Address] CHAR(15),
HairColor CHAR(6),ImportDate DATETIME,
AuditDate DATETIME,[Action] CHAR(10)
)

INSERT INTO #TableA
VALUES (1,'Joe','1 Main St.','Brown','12/1/2013')

INSERT INTO #TableA
VALUES (2,'Jen','1 Main St.','Red','12/1/2013')

INSERT INTO #TableB
VALUES (2,'Jen','1 Main St.','Blonde','10/1/2013','12/1/2013','CHANGE')

INSERT INTO #TableB
VALUES (2,'Jen','1 Baker St.','Blonde','4/1/2010','10/1/2013','CHANGE')

INSERT INTO #TableB
VALUES (2,'Jen','4 Deer Ave.','Black','6/1/2004','4/1/2010','CHANGE')

SELECT * FROM #TableA AS ta

SELECT * FROM #TableB AS tb

Table A

╔══════════╦══════╦════════════╦═══════════╦════════════╗
║ UniqueID ║ Name ║  Address   ║ HairColor ║ ImportDate ║
╠══════════╬══════╬════════════╬═══════════╬════════════╣
║        1 ║ Joe  ║ 1 Main St. ║ Brown     ║ 12/1/2013  ║
║        2 ║ Jen  ║ 1 Main St. ║ Red       ║ 12/1/2013  ║
╚══════════╩══════╩════════════╩═══════════╩════════════╝

Table B

╔══════════╦══════╦═════════════╦═══════════╦════════════╦═══════════╦════════╗
║ UniqueID ║ Name ║   Address   ║ HairColor ║ ImportDate ║ AuditDate ║ Action ║
╠══════════╬══════╬═════════════╬═══════════╬════════════╬═══════════╬════════╣
║        2 ║ Jen  ║ 1 Main St.  ║ Blonde    ║ 10/1/2013  ║ 12/1/2013 ║ CHANGE ║
║        2 ║ Jen  ║ 1 Baker St. ║ Blonde    ║ 4/1/2010   ║ 10/1/2013 ║ CHANGE ║
║        2 ║ Jen  ║ 4 Deer Ave. ║ Black     ║ 6/1/2004   ║ 4/1/2010  ║ CHANGE ║
╚══════════╩══════╩═════════════╩═══════════╩════════════╩═══════════╩════════╝

Table A is truncated and reloaded each month but before that is done, a comparison between the new data coming in to Table A (new A) and the data already in Table A (old A) takes place.

If new A does not contain a row that is in old A I insert the old A row into Table B with the insertdate and DELETED as the Action.

If new A contains a row that is not in old A I insert the new A row into Table B with the insertdate and ADDED as the Action.

If new A contains a row that has any column that doesn't match with old A I insert the old A row into Table B with the insertdate and CHANGE as the Action.

As you can see, Table B can have multiple rows with the same unique identifier as Table A because the source data for the unique record in Table A could change from month to month.

Now additions and deletions are easy to report. What I am having difficulty with is reporting changes. With 260 common rows I'd like to be able to select only the columns in Table B that do not match the related columns in Table A, left joined on TableA.UniqueID = TableB.UniqueID

I was thinking about making a table valued function with using the input UniqueID from Table A and outputting something like this:

╔══════════╦════════════╦═════════════╦════════════╦═══════════╦════════╗
║ UniqueID ║ ColumnName ║    Value    ║ ImportDate ║ AuditDate ║ Action ║
╠══════════╬════════════╬═════════════╬════════════╬═══════════╬════════╣
║        2 ║ HairColor  ║ Blonde      ║ 10/1/2013  ║ 12/1/2013 ║ CHANGE ║
║        2 ║ Address    ║ 1 Baker St. ║ 4/1/2010   ║ 10/1/2013 ║ CHANGE ║
║        2 ║ HairColor  ║ Blonde      ║ 4/1/2010   ║ 10/1/2013 ║ CHANGE ║
║        2 ║ Address    ║ 4 Deer Ave. ║ 6/1/2004   ║ 4/1/2010  ║ CHANGE ║
║        2 ║ HairColor  ║ Black       ║ 6/1/2004   ║ 4/1/2010  ║ CHANGE ║
╚══════════╩════════════╩═════════════╩════════════╩═══════════╩════════╝

I am just not sure how to do it.

This is where my brain stops working. Is it possible to use ORDINAL_POSITION in INFORMATION_SCHEMA.COLUMNS to my advantage?

SELECT a.ORDINAL_POSITION,a.COLUMN_NAME 
FROM INFORMATION_SCHEMA.COLUMNS AS a 
JOIN INFORMATION_SCHEMA.COLUMNS AS b ON a.COLUMN_NAME = b.COLUMN_NAME
WHERE a.TABLE_NAME = 'TableA' AND a.TABLE_SCHEMA='dbo'
AND b.TABLE_NAME = 'TableB' AND b.TABLE_SCHEMA='dbo'

Upvotes: 3

Answers (3)

Manny

Reputation: 977

Sounds like UNPIVOT is what you need. From MSDN: Using PIVOT and UNPIVOT

Here's a possible solution to your question:

SELECT UniqueID, ColumnName, Value, ImportDate, AuditDate, Action
FROM
(
    SELECT 
        TableA.UniqueID, 
        CAST((CASE WHEN ISNULL(TableB.Address, '') <> ISNULL(TableA.Address, '')
              THEN ISNULL(TableB.Address, '')
              ELSE NULL END) AS nvarchar(255))
        AS Address, 
        CAST((CASE WHEN ISNULL(TableB.HairColor, '') <> ISNULL(TableA.HairColor, '')
              THEN ISNULL(TableB.HairColor, '')
              ELSE NULL END) AS nvarchar(255))
        AS HairColor, 
        TableB.ImportDate,
        TableB.AuditDate,
        TableB.Action
    FROM TableA INNER JOIN TableB ON TableB.UniqueID = TableA.UniqueID
) AS p
UNPIVOT
(
    Value FOR ColumnName in (Address, HairColor)
) AS up
WHERE Value IS NOT NULL

Note that I cast the values to nvarchar(255) in order to resolve conflicts between types. You may need adjust that to something more appropriate.

Also, at a glance, I do think the generated results may need some refining based around the import and audit dates. But that's another topic. Also, I ended up using an INNER JOIN since a LEFT JOIN didn't make sense to me in this case. Maybe I'm missing something.

So, all that said, this should get the results that you're looking for.

Upvotes: 0

Shiva

Reputation: 20975

If you have Visual Studio Premium or Ultimate, you can use the Schema Compare Database Project for this purpose. It's pretty sophisticated and even lets you generate DDL for differences in source and/or target. You can compare data also if needed. I've used it in many projects in the past.

MSDN: Compare and Synchronize Database Schemas.

Upvotes: 0

twrowsell

Reputation: 467

You could try this for every column...

SELECT a.UniqueID,'HairColour' [ColumnName],b.HairColour [Value],...,'CHANGE' [Action] FROM Tableb b
INNER JOIN Tablea a ON
   a.UniqueID = b.uniqueID
WHERE a.HairColor <> b.HairColor

UNION

    SELECT a.UniqueID,'Address' [ColumnName],b.Address [Value],...,'CHANGE' [Action] FROM Tableb b
INNER JOIN Tablea a ON
   a.UniqueID = b.uniqueID
WHERE a.Address <> b.Address

etc...

Upvotes: 0

How do I compare two structurally similar tables and only return the column name and value of differing values in SQL Server?

Answers (3)

Related Questions