Reputation: 40653
What are the general strategies in DB design to maintain a revision history? If it were just one table I was dealing with, I think it wouldn't be so hard. Just save each update as a new record in the table. The last record will always be the latest revision.
But when the data is stored across multiple tables, what's a good way to design that so that it can track revisions?
Upvotes: 6
Views: 5963
Reputation: 25122
Datadiff. API powered DB revision tracking.
Full disclosure:
I built Datadiff. I needed a solution that provided a visual history of a data model in MongoDB for help supporting a SASS product. It will work with SQL databases too.
You can use do basic querying with key:val
notation. i.e id:123
Upvotes: 0
Reputation: 52157
The hard part is not the versioning of the "base" tables - you just version them individually as you would a single table in isolation.
The hard part is tracking connections between them.
How exactly are you going to do that depends on the requirements of the particular project. Here is an example of how sales orders could be "historized", but there are many other variations possible.
Upvotes: 2
Reputation: 4072
I prefer to have additional historical table for each versioned table. Same structure as main table with time_from
and time_to
additional fields.
Transparently filled with triggers. time_to
of latest revision set to far far future.
State for specified moment can be retrieved with query like this:
SELECT * FROM user_history
WHERE time_from >= '2012-02-01' AND time_to <= '2012-02-01'
As for me, storing history within main table is not generally a good idea, as it requires complicated conditions when retrieving or joining current data.
Upvotes: 4
Reputation: 22925
I am using approach, where each object that I'm dealing with has at least 1 so called instance table, where I keep the data that tends to change over time. Typically such tables follow the following concept:
_HISTORY
suffix in the name;start_dt
and end_dt
, indicating object instance's lifetime;start_dt
is NOT NULL
, end_dt
can be NULL
, which indicates that instance is current and is not limited in it's time;1/Jan-2013
, then you need to set end_dt
of the current instance to 31/Dec-2012 23:59:59
and insert a new record with start_dt
of 1/Jan-2013 00:00:00
;revision
field, if it is necessary to track revisions.In order to have a proper RI constraints with such design, I always have 2 tables for versioned obejcts. Say, for Customer
obejct I have the following set of tables:
customer (customer_id INTEGER, PRIMARY KEY (customer_id));
customer_history (customer_id INTEGER, start_dt TIMESTAMP, end_dt TIMESTAMP,
name VARCHAR(50), sex CHAR(1), ...,
PRIMARY KEY (customer_id, start_dt));
customer_bank_history (customer_id INTEGER, start_dt TIMESTAMP, end_dt TIMESTAMP,
bank_id INTEGER, iban VARCHAR(34));
In all other places I use customer(customer_id)
to build foreign keys. Querying actual customer details is simple:
SELECT c.customer_id, ch.name, ch.sex
FROM customer c
JOIN customer_history ch ON c.customer_id = ch.customer_id
AND now() BETWEEN ch.start_dt AND coalesce(end_dt, now());
Why I prefer such design:
Hope this will help you.
Upvotes: 1