Chris Schmidt
Chris Schmidt

Reputation: 398

How to determine what's changed between database records

Presume first, that the following table exists in a MySQL Database

|----|-----|-----|----|----|-----------|--------------|----|
| id | rid | ver | n1 | n2 | s1        | s2           | b1 |
|----|-----|-----|----|----|-----------|--------------|----|
| 1  | 1   | 1   | 0  | 1  | Hello     | World        | 0  |
| 2  | 1   | 2   | 1  | 1  | Hello     | World        | 0  |
| 3  | 1   | 3   | 0  | 0  | Goodbye   | Cruel World  | 0  |
| 4  | 2   | 1   | 0  | 0  | Hello     | Doctor       | 1  | 
| 5  | 2   | 2   | 0  | 0  | Hello     | Nurse        | 1  |
| 6  | 3   | 1   | 0  | 0  | Dippity   | Doo-Dah      | 1  |
|----|-----|-----|----|----|-----------|--------------|----|

Question

How do I write a query to determine whether for any given rid, what changed between the most recent version and the version immediately preceding it (if any) such that it produces something like this:

|-----|-----------------|-----------------|-----------------|
| rid | numbers_changed | strings_changed | boolean_changed |
|-----|-----------------|-----------------|-----------------|
| 1   | TRUE            | TRUE            | FALSE           |
| 2   | FALSE           | TRUE            | FALSE           |
| 3   | n/a             | n/a             | n/a             |
|-----|-----------------|-----------------|-----------------|

I think that I should be able to do this by doing a cross-join between the table and itself but I can't resolve how to perform this join to get the desired output.

I need to generate this "report" for a table with 10's of columns and 1-10 versions of 100's of records (resulting in 1000's of rows). Note the particular design of the database is not my own and altering the structure of the database (at this time) is not an acceptable approach.

The actual format of the output isn't important - and if it simplifies the query getting a "full breakdown" of what changed for each "change set" would also be acceptable, for example

|-----|-----|-----|----|----|----|----|----|
| rid | old | new | n1 | n2 | s1 | s2 | b1 |
|-----|-----|-----|----|----|----|----|----|
| 1   | 1   | 2   | Y  | N  | N  | N  | N  |
| 1   | 2   | 3   | Y  | Y  | Y  | Y  | N  |
| 2   | 4   | 5   | N  | N  | N  | Y  | N  |
|-----|-----|-----|----|----|----|----|----|

Note that it is also ok, in this case to omit rid records which only have a single version, as for the purposes of this report I only care about records that have changed and getting a separate list of records that haven't changed is an easy query

Upvotes: 1

Views: 46

Answers (1)

Paul Spiegel
Paul Spiegel

Reputation: 31772

You can join every row with the following one with

select *
from history h1
join history h2
  on  h2.rid = h1.rid
  and h2.id = (
    select min(h.id)
    from history h
    where h.rid = h1.rid
      and h.id > h1.id
  );

Then you just need to compare every column from the two rows like h1.n1 <> h2.n1 as n1.

The full query would be:

select h1.rid, h1.id as old, h2.id as new
  , h1.n1 <> h2.n1 as n1
  , h1.n2 <> h2.n2 as n2
  , h1.s1 <> h2.s1 as s1
  , h1.s2 <> h2.s2 as s2
  , h1.b1 <> h2.b1 as b1
from history h1
join history h2
  on  h2.rid = h1.rid
  and h2.id = (
    select min(h.id)
    from history h
    where h.rid = h1.rid
      and h.id > h1.id
  );

Result:

| rid | old | new | n1 | n2 | s1 | s2 | b1 |
|-----|-----|-----|----|----|----|----|----|
|   1 |   1 |   2 |  1 |  0 |  0 |  0 |  0 |
|   1 |   2 |   3 |  1 |  1 |  1 |  1 |  0 |
|   2 |   4 |   5 |  0 |  0 |  0 |  1 |  0 |

Demo: http://sqlfiddle.com/#!9/2e5d12/5

If the columns can contain NULLs, You might need something like NOT h1.n1 <=> h2.n1 as n1. <=> is a NULL-save equality check.

If the version within a rid group is guaranteed to be consecutive, you can simplify the JOIN to

from history h1
join history h2
  on  h2.rid = h1.rid
  and h2.ver = h1.ver + 1

Demo: http://sqlfiddle.com/#!9/2e5d12/7

Upvotes: 2

Related Questions