Ash
Ash

Reputation: 1298

Show records of changed columns only in a SQL table

In this project, every table has a "history table", which looks the same as the original, but has these columns added: [HistId],[ActionUser],[ActionCode],[ActionDate],[ValidUntil] Basically on every insert, update, delete, we make something like a log. Afterwards, it is easy for us to see what the user did, who did what, and when.

See how it looks, here

The issue is the way the data history is displayed. I see that it was basically used a select all, and everything is displayed like an excel spreadsheet. It is very hard to find what was changed and what not. Mostly the changes are made one column at a time. And when you are trying to find something specific, it is hard to eyeball the info that you need.

I was thinking to use a different approach. I would like to display only what was changed in something that looks like a log. (I would also make this searchable later on)

The way it would look like:

Record with title "a" inserted by albert on 2010-01-01

Record's Title was updated from "a" to "b" by john on 2010-01-02

Record's Title was updated from "b" to "c" by dave on 2010-01-03

Record's Description was updated from "abc" to "def" by paul on 2010-01-04

...etc...

Record was deleted by bin on 2010-01-08

For the change that was on the 6th (title and description) I was thinking to have either 1 record, either 2 records, whichever is less complicated. So basically I am looking for a way, to display ONLY CHANGED data from a set of records, nothing else

Extra info:

  1. The History table-s structure can not be changed, because there are a lot of tables like this.

  2. I would like to only change the way we retrieve the data.

  3. there are tables with 20+ columns

  4. There are tables with 30 million records, so the performance is important, but usually there will not be too much data displayed, I would say 50 records max)

  5. Having both "from" and "to" in every line, looks nice, but if it is too complex, then the "to" is enough

Upvotes: 1

Views: 762

Answers (2)

KumarHarsh
KumarHarsh

Reputation: 5094

Assumption i) History table won't be change for whatever reason.

ii) Since you will display 50,100 ,150 record at a time.so million record is not a problem.

Try this and let me know,

create table #side ([Id] [int] IDENTITY NOT NULL,
                [Title] [varchar](50) NOT NULL,
                [Description] [varchar](250) NULL)

create table #h ([HistId] [int] IDENTITY(1,1) NOT NULL,
                [Id] [int] NOT NULL,
                [Title] [varchar](50) NOT NULL,
                [Description] [varchar](250) NULL,
                [TypeId] [int] NULL,
                [ActionUser] [int] NULL,
                [ActionCode] [char](1) NOT NULL,
                [ActionDate] [datetime] NOT NULL,
                [ValidUntil] [datetime] NULL)

insert into #side ([Title],[Description]) values ('a','abc')

insert into #h ([Id],[Title],[Description],[TypeId],[ActionUser],[ActionCode],[ActionDate],[ValidUntil]) values (1,'a','abc',123,991,'i','01/01/2010',NULL) 

declare @mod datetime;
set @mod = '01/02/2010'

insert into #h ([Id],[Title],[Description],[TypeId],[ActionUser],[ActionCode],[ActionDate],[ValidUntil]) values (1,'b','abc',123,991,'u',@mod,NULL) 


insert into #h ([Id],[Title],[Description],[TypeId],[ActionUser],[ActionCode],[ActionDate],[ValidUntil]) values (1,'c','abc',123,991,'u',@mod,NULL) 

insert into #h ([Id],[Title],[Description],[TypeId],[ActionUser],[ActionCode],[ActionDate],[ValidUntil]) values (1,'c','def',123,991,'u',@mod,NULL) 

insert into #h ([Id],[Title],[Description],[TypeId],[ActionUser],[ActionCode],[ActionDate],[ValidUntil]) values (1,'d','pqr',123,991,'u',@mod,NULL) 
select * from #h order by HistId Desc
--select * from #side

select h.HistId, case when h.ActionCode='i' THEN
'Record with Title "'+h.title+'" inserted '
when h.ActionCode='u' THEN
'Records '+ case 
when h.title<>h1.title and h.[Description]<>h1.[Description]  then  'Title,Description was updated from 
"'+h1.title+'","'+h1.[Description]+'" to " '+h.title+' "  , "'+h.[Description]+'" respectively'
when h.title<>h1.title then  'Title was updated from "'+h1.title+'" to " '+h.title+' " '
when h.[Description]<>h1.[Description] then  'Description was updated from "'+h1.[Description]+'" to " '+h.[Description]+' " '
 else '' end
 when h.ActionCode='d' THEN
 'Record was deleted'
 else
null
END
+'by '+cast(h.ActionUser as varchar)+' on '+convert(varchar(10),h.actiondate ,120)+''
from #h h
left join #h h1
on h.HistId=(h1.HistId+1)
--left join #usertable u
--on u.userid=h.[ActionUser]

drop table #h
drop table #side

Upvotes: 1

David Rushton
David Rushton

Reputation: 5030

This is a simple approach, that uses a self-join to compare each record against the previous entry. This technique assumes that HistId is an unbroken sequence of integers.

Sample Data

/* Sample data.
 * Three records are created.   
 * The first is a base line.  
 * The next two are updates.
 * Update one contains one change.
 * Update two contains two changes.
 *
 * See also: http://meta.stackoverflow.com/questions/333952/why-should-i-provide-an-mcve-for-what-seems-to-me-to-be-a-very-simple-sql-query
 */ 
DECLARE @SampleHistory TABLE 
(
    HistoryId       INT IDENTITY(1, 1),
    Title           VARCHAR(255),
    [Description]   VARCHAR(255)
)


INSERT INTO @SampleHistory
(
    Title,
    [Description]
)
VALUES  
    ('Data Warehouse Toolkit', 'First Edition'),            -- Base record.
    ('Data Warehouse Toolkit', 'Second Edition'),           -- Description changed.
    ('The Data Warehouse Toolkit', 'Third Edition')         -- Name and description changed.
;

Query

/* Change detection, using self join.
 * See also:  https://msdn.microsoft.com/en-us/library/ms177490.aspx
 */
SELECT
    CASE
        WHEN h2.Title IS NULL THEN 'New Title: ' + h1.Title + '.  '
        WHEN h1.Title <> h2.Title THEN 'Title updated from ' + h2.Title + ' to ' + h1.Title + '.  '
        ELSE ''
    END
    +
    CASE
        WHEN h2.[Description] IS NULL THEN 'New Description: ' + h1.[Description] + '.  '
        WHEN h1.[Description] <> h2.Title THEN 'Description updated from ' + h2.[Description] + ' to ' + h1.[Description] + '.  '
        ELSE ''
    END
FROM
    @SampleHistory AS h1
        LEFT OUTER JOIN @SampleHistory AS h2            ON h2.HistoryId = (h1.HistoryId - 1)
;

The downside to this approach is you have to write a case expression for every field in every history table. Not good. One way to avoid all this typing is to use dynamic SQL. But that can be complicated to compose!

Final Thoughts

I don't think SQL is well suited to tasks like this. Because of its set based nature, looping over fields and conditionally returning values requires a lot of effort. You could achieve the same results, with far less effort, in C#, Java, etc.

SQL Server 2016 includes Change Data Capture out of the box, for all editions. CDC is included in SQL Server 2008, 2012 and 2014 Enterprise and Developer editions. Where possible; I prefer to work with native functionality. Hand-rolled solutions require administration, and often lack the advanced features added by the developer.

Upvotes: 0

Related Questions