Oeiloff
Oeiloff

Reputation: 77

how to make history of sql data? (report data changes)

Every day, I am saving (with crontab, php script) into database bugs information. Every row is like: (Bugidentification, Date, Title, Who, etc....)

(e.g: Bugidentification, Date, Title, Who, etc....

issue1, 2015-04-01, blabla, bill, etc...

issue2, 2015-04-01, nnnnnnn, john, etc...

issue3, 2015-04-01, vvvvvvv, greg, etc...

issue1, 2015-04-02, blabla, bill, etc...

issue2, 2015-04-02, nnnnnnn, john, etc...

issue3, 2015-04-02, vvvvvvv, mario, etc... (here it is now mario)

issue2, 2015-04-03, nnnnnnn, john, etc... (issue1 dissapeared)

issue3, 2015-04-03, vvvvvvv, tod, etc... (tod is new info)

issue4, 2015-04-03, rrrrrrrr, john, etc... (issue4 is new)

............................................. )

Basically if I take example I posted above, results should be something like for comparison between date of April 2nd and April 3rd

New row is : issue4

Closed row is : Issue1

Updated row is : Issue3 (with tod instead of mario)

No change row is : Issue2

In my case there are hundreds of rows and I believe I know how to do it thanks to php, but my code will be long like creating foreach loops and see one by one if any change. I am not sure I am getting straightforward solution.

So my question is, is there any simple way to report those changes with "simple" code (like sql special request or any project code out there or simple php functions?).

Upvotes: 1

Views: 577

Answers (1)

TommCatt
TommCatt

Reputation: 5636

There are way too many assumptions built into this design. And those assumptions require you to compare rows between different days to make the assumption in the first place -- not to mention you have to duplicate unchanged rows from one day to the next in order to maintain the unbroken daily entry needed to feed the assumptions. Whew.

Rule 1: don't build assumptions into the design. If something is new, it should be marked, "HEY! I'm new here!" When a change has been made to the data, "OK, something changed. Here it is." and when the issue has finally been closed, "OK, that's it for me. I'm done for."

create table Bug_Static( -- Contains one entry for each bug
    ID     int identity,
    Opened date not null default sysdate,
    Closed date [null | not null default date '9999-12-31'],
    Title  varchar(...),
    Who    id references Who_Table,
    <other non-changing data>,
    constraint PK_Bug_Static primary key( ID )
);
create table Bug_Versions( -- Contains changing data, like status
    ID        int not null,
    Effective date not null,
    Status    varchar not null,  -- new,assigned,in work,closed,whatever
    <other data that may change from day to day>,
    constraint PK_Bug_Versions primary key( ID, Effective ),
    constraint FK_Bug_Versions_Static foreign key( ID )
        references Bug_Static( ID )
);

Now you can select the bugs and the current data (the last change made) on any given day.

select  s.ID, s.Opened, s.Title, v.Effective, v.Status
from    Bug_Static   s
join    Bug_Versions v
    on  v.ID = s.ID
    and v.Effective =(
        select  Max( Effective )
        from    Bug_Versions
        where   ID = v.ID
            and Effective <= sysdate )
where   s.Closed < sysdate;

The where s.Closed < sysdate is optional. What that gives you is all the bugs that were closed on the date the query is executed, but not the ones closed before then. That keeps the closed bugs from reappearing over and over again -- unless that's what you want.

Change the sysdate values to a particular date/time and you will get the data as it appeared as of that date and time.

Normally, when a bug is created, a row is entered into both tables. Then only new versions are entered as the status or any other data changes. If nothing changed on a day, nothing is entered. Then when the bug is finally closed, the Closed field of the static table is updated and a closed version is inserted into the version table. I've shown the Closed field with two options, null or with the defined "maximum date" of Dec 31, 9999. You can use either one but I like the max date method. It simplifies the queries.

I would also front both tables with a couple of views which joins the tables. One which shows only the last versions of each bug (Bug_Current) and one which shows every version of every bug (Bug_History). With triggers on Bug_Current, it can be the one used by the app to change the bugs. It would change, for instance, an update of any versioned field to an insert of a new version.

The point is, this is a very flexible design which you can easily show just the data you want, how you want it, as of any time you want.

Upvotes: 1

Related Questions