Reputation: 363

Compare two rows and identify columns whose values are different

The Situation

We have an application where we store machine settings in a SQL table. When the user changes a parameter of the machine, we create a "revision", that means we insert a row into a table. This table has about 200 columns. In our application, the user can take a look on each revision.

The Problem

We want to highlight the parameters that have changed since the last revision.

The Question

Is there an SQL-only way to get the column names of the differences between two rows?

An Example

ID | p_x | p_y | p_z
--------------------
11 | xxx | yyy | zzz

12 | xxy | yyy | zzy

The query should return p_x and p_z.

EDIT

The table has 200 columns, not rows...

MY WAY OUT

My intention was to find a "one-line-SQL-statement" for this problem.

I see in the answers below, it's kind a bigger thing in SQL. As there is no short, SQL-included solution for this problem, solving it in the backend of our software (c#) is of course much easier!

But as this is not a real "answer" to my question, I don't mark it as answered.

Thanks for the help.

Upvotes: 15

Answers (6)

Tapeo

Reputation: 21

You can convert both rows to json, and call a function to compare the two json

DECLARE @json1 NVARCHAR(Max)
DECLARE @json2 NVARCHAR(Max)

set @json1 = ( 
SELECT *
  FROM tbl
 where id=11
FOR JSON PATH, ROOT('data')
)

set @json2 = ( 
SELECT *
  FROM tbl
 where id=12
FOR JSON PATH, ROOT('data')
)


select * from  dbo.compare_jsonobject (  @json1, @json2)
where SideIndicator <> '=='

json function can be found here

https://www.red-gate.com/simple-talk/blogs/sql-server-json-diff-checking-for-differences-between-json-documents/

Upvotes: 0

CDeuce

Reputation: 1

Anna L's solution needs one minor revision. The WHERE statement does not correctly return changed values

where c1.Value <> c2.Value or not (c1.Value is null and c2.Value is null)

If c1.Value = 1 and c2.Value = 1, it will return true because they are both not null even though they are the same value. Here are two alternative options.

Option 1 - Checking if equal within a case statement and reversing it

WHERE (CASE WHEN c1.Value = c2.Value OR (c1.Value IS NULL AND c2.Value IS NULL) 
       THEN 0 ELSE 1 END) = 1  
        THEN 1 ELSE 0 END

Option 2 - Checking if one is null and the other is not null

WHERE c1.Value <> c2.Value 
OR (c1.Value IS NOT NULL AND c2.Value IS NULL) 
OR (c1.Value IS NULL AND c2.Value IS NOT NULL)

On another note, I found this answer helpful. You can now check differences using json.

Upvotes: 0

Weihui Guo

Reputation: 3997

You can use unpivot and pivot. The key is to transpose data so that you can use where [11] != [12].

WITH CTE AS (
    SELECT * 
    FROM 
    (
        SELECT ID, colName, val
        FROM tblName
        UNPIVOT
        (
            val
            FOR colName IN ([p_x],[p_y],[p_z])
        ) unpiv
    ) src
    PIVOT
    (
        MAX(val)
        FOR ID IN ([11], [12])
    ) piv
)
SELECT colName
--SELECT *
FROM CTE WHERE [11] != [12]

If there are only a few columns in the table, it's easy to simply put [p_x],[p_y],[p_z], but obviously it's not convenient to type 50 or more columns. Even though you may use this trick to drag and drop, or copy/paste, the column names from the table, it's still bulky. And for that, you may use the SELECT * EXCEPT strategy with dynamic sql.

DECLARE @TSQL NVARCHAR(MAX), @colNames NVARCHAR(MAX)
SELECT @colNames = COALESCE(@colNames + ',' ,'') + [name] 
FROM syscolumns WHERE name  <> 'ID' and id = (SELECT id FROM sysobjects WHERE name = 'tablelName')

SET @TSQL = '
    WITH CTE AS (
        SELECT * 
        FROM 
        (
            SELECT ID, colName, val
            FROM tablelName
            UNPIVOT
            (
                val
                FOR colName IN (' + @colNames + ')
            ) unpiv
        ) src
        PIVOT
        (
            MAX(val)
            FOR ID IN ([11], [12])
        ) piv
    )
    --SELECT colName
    SELECT *
    FROM CTE WHERE [11] != [12]
'
EXEC sp_executesql @TSQL

Upvotes: 1

Ann L.

Reputation: 13965

You say:

 We want to highlight the parameters that have changed since the last revision.

This implies that you want the display (or report) to make the parameters that changed stand out.

If you're going to show all the parameters anyway, it would be a lot easier to do this programmatically in the front end. It would be a much simpler problem in a programming language. Unfortunately, not knowing what your front end is, I can't give you particular recommendations.

If you really can't do it in the front end but have to receive this information in a query from the database (you did say "SQL-only"), you need to specify the format you'd like the data in. A single-column list of the columns that changed between the two records? A list of columns with a flag indicating which columns did or didn't change?

But here's one way that would work, though in the process it converts all your fields to nvarchars before it does its comparison:

Use the technique described here (disclaimer: that's my blog) to transform your records into ID-name-value pairs.

Join the resulting data set to itself on ID, so that you can compare the values and print those that have changed:

 with A as (    
--  We're going to return the product ID, plus an XML version of the     
--  entire record. 
select  ID    
 ,   (
      Select  *          
      from    myTable          
      where   ID = pp.ID                            
      for xml auto, type) as X 
from    myTable pp )
, B as (    
--  We're going to run an Xml query against the XML field, and transform it    
--  into a series of name-value pairs.  But X2 will still be a single XML    
--  field, associated with this ID.    
select  Id        
   ,   X.query(         
       'for $f in myTable/@*          
       return         
       <data  name="{ local-name($f) }" value="{ data($f) }" />      
       ') 
       as X2 from A 
)
,    C as (    
 --  We're going to run the Nodes function against the X2 field,  splitting     
 --  our list of "data" elements into individual nodes.  We will then use    
 -- the Value function to extract the name and value.   
 select B.ID as ID  
   ,   norm.data.value('@name', 'nvarchar(max)') as Name  
   ,   norm.data.value('@value', 'nvarchar(max)') as Value
from B cross apply B.X2.nodes('/myTable') as norm(data))

-- Select our results.

select *
from ( select * from C where ID = 123) C1
full outer join ( select * from C where ID = 345) C2
    on C1.Name = c2.Name
where c1.Value <> c2.Value 
  or  not (c1.Value is null and c2.Value is null)

Upvotes: 6

Dudi Konfino

Reputation: 1136

for sql server 2012 you can do something like that (duplicate it for each column):

  SELECT iif((p_x != lead(p_x) over(ORDER BY p_x)),
                                       (SELECT COLUMN_NAME 
                                        FROM INFORMATION_SCHEMA.COLUMNS
                                        WHERE TABLE_NAME = 'tbl' 
                                        AND 
                                        TABLE_SCHEMA='schema' 
                                        AND 
                                        ORDINAL_POSITION='1')
                                                 ,NULL)
FROM tbl

for sql server 2008 try

DECLARE @x int =11  -- first id
WHILE @x!=(SELECT count(1) FROM tbl)
BEGIN --comparison of two adjacent rows
if (SELECT p_x  FROM tbl WHERE id=@x)!=(SELECT p_x  FROM tbl WHERE id=@x+1)

 BEGIN
 SELECT COLUMN_NAME
 FROM INFORMATION_SCHEMA.COLUMNS
 WHERE TABLE_NAME = 'tbl'  --insert your table
 AND 
 TABLE_SCHEMA='schema'   --insert your schema
 AND 
 ORDINAL_POSITION='1'  --first column 'p_x'
END
set @x=@x+1
END

Upvotes: 0

Code Different

Reputation: 93161

Here's one way using UNPIVOT:

;WITH
    cte AS
    (
        SELECT      CASE WHEN t1.p_x <> t2.p_x THEN 1 ELSE 0 END As p_x,
                    CASE WHEN t1.p_y <> t2.p_y THEN 1 ELSE 0 END As p_y,
                    CASE WHEN t1.p_z <> t2.p_z THEN 1 ELSE 0 END As p_z
        FROM        MyTable t1, MyTable t2
        WHERE       t1.ID = 11 AND t2.ID = 12 -- enter the two revisions to compare here
    )

SELECT      *
FROM        cte
UNPIVOT     (
                Changed FOR ColumnName IN (p_x, p_y, p_z)
            ) upvt
WHERE       upvt.Changed = 1

You have to add code to handle NULLs during the comparisons. You can also build the query dynamically if there are lots of columns in your table.

Upvotes: 0

Compare two rows and identify columns whose values are different

Answers (6)

Related Questions