Jon Warren
Jon Warren

Reputation: 857

SQL query to find difference in text between different rows

I have a table that resembles the following:

                     INITIAL TABLE
-------------------------------------------------------

ID        Date           Author      Data1       Data9
==  ================   ==========   ======= ... =======
1   2017-06-04 16:47   John Smith     Foo         Bar
2   2017-06-04 16:50   John Smith     Goo         Bar
3   2017-06-04 16:52   John Smith     Hoo         Car
4   2017-06-04 16:55   Bill Plith     Foo         Bar
5   2017-06-04 16:59   John Smith     Foo         Car
6   2017-06-04 17:04   Bill Plith     Foo         Bar

The data comes from users changing a data set from a central source. The user's name, the datetime that the change was made, as well as a flush of all data fields' values are appended to the table whenever a change is made to any data field.

The problem is that it's tedious to manually check over each column to see where the actual change was made. Sometimes a change is made to one field, sometimes a change is made to multiple fields, and sometimes no changes are made at all (if the user clicked "Save Changes" with no changes actually made, a new row will still get added to the table).

What I want to do is query the table to produce results similar to the following:

                     CHANGE TABLE
--------------------------------------------------------

      Date           Author              Changes
================   ==========   ========================
2017-06-04 16:50   John Smith   Data1 was changed to Goo
2017-06-04 16:52   John Smith   Data1 was changed to Hoo
2017-06-04 16:52   John Smith   Data9 was changed to Car
2017-06-04 16:55   Bill Plith   Data1 was changed to Foo
2017-06-04 16:55   Bill Plith   Data9 was changed to Bar
2017-06-04 16:59   John Smith   Data9 was changed to Car
2017-06-04 17:04   Bill Plith   Data9 was changed to Bar

Some more things to note:

I think that's everything. Thank you in advance to anyone who can assist, I'm still in the process of learning SQL so if I left out anything pertinent please do leave a comment and I'll fill in the gaps!

Upvotes: 1

Views: 320

Answers (2)

Andrei Odegov
Andrei Odegov

Reputation: 3439

It's possible to use the UNPIVOT operator and the default parameter of LAG function.

with
  unp as(
    select *
    from @Changes
    unpivot (
      Data For DataNam in (Data1,Data2,Data3,Data4,Data5,Data6,Data7,Data8,Data9)
    ) u
  ),
  a as(
    select *,
      lag(Data,1,Data) over(partition by DataNam order by [Date]) PrevData
    from unp
  )
select [Date], Author,DataNam+' was changed to '+Data Changes
from a
where Data!=PrevData
order by DataNam,[Date];

This can be checked online.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1270301

You can use a giant case expression with lag(). Assuming none of the values are ever NULL:

select c.date, c.author,
       stuff( ((case when data1 <> lag(data1) over (partition by date) then ', data1' else '' end) +
               (case when data2 <> lag(data2) over (partition by date) then ', data2' else '' end) +
               . . .
              ), 1, 2, '') as changes
from changes c;

This can be modified to handle NULL values, although that complicates the expressions a bit.

Upvotes: 1

Related Questions