Ben
Ben

Reputation: 21

Concept of "version control" for database table rows (Not referring to storing scripts in GIT/SVN)

I require a data store that will maintain not only a history of changes made to data (easy to do) but also store any number of proposed changes to data, including chained proposals (ie. proposal-on-proposal).

Think of these "changes" as really long-running transactions which are saved to the database and have a lifespan of anywhere between minutes and years.

They are created (proposed) and then either rolled back (essentially deleted) or committed, when committed they become the effective data visible to 3rd parties.

Of course this all requires some form of conflict resolution as proposed changes can be in contradictory states (eg. Change A proposes to delete a record but change B proposes to update it - if change A is committed first then change B will have to revert)

I have found no off-the-shelf product that can do this. The closest was Oracle Workspace Manager but it did not provide for change-on-change or the ability to see proposed deletes. The only way I have been able to achieve this is to have a set of common columns on my versioned tables:

Root ID: Required - set once to the same value as the primary key when the first version of a record is created. This represents the primary key across all of time and is copied into each version of the record. You should consider the Root ID when naming relation columns (eg. PARENT_ROOT_ID instead of PARENT_ID). As the Root ID is also the primary key of the initial version, foreign keys can be created against the actual primary key - the actual desired row will be determined by the version filters defined below.

Change ID: Required - every record is created, updated, deleted via a change

Copied From ID: Nullable - null indicates newly created record, not-null indicates which record ID this row was cloned/branched from when updated/deleted

Effective From Date/Time: Nullable - null indicates proposed record, not-null indicates when the record became current. Unfortunately a unique index cannot be placed on Root ID/Effective From as there can be multiple null values for any Root ID. (Unless you want to restrict yourself to a single proposed change per record)

Effective To Date/Time: Nullable - null indicates current or proposed, not-null indicates when it became historical. Not technically required but helps speed up queries finding the current data. This field could be corrupted by hand-edits but can be rebuilt from the Effective From Date/Time if this occurs.

Delete Flag: Boolean - set to true when it is proposed that the record be deleted upon becoming current. When deletes are committed, their Effective To Date/Time is set to the same value as the Effective From Date/Time, filtering them out of the current data set.

The query to get the current state of data at a point in time would be;

SELECT * FROM table WHERE EFFECTIVE_FROM <= :Now AND (EFFECTIVE_TO IS NULL OR EFFECTIVE_TO > :Now)

The query to get the current state of data according to a change would be;

SELECT * FROM table WHERE (CHANGE_ID IN :ChangeIds OR (EFFECTIVE_FROM <= :Now AND (EFFECTIVE_TO IS NULL OR EFFECTIVE_TO > :Now) AND ROOT_ID NOT IN (SELECT ROOT_ID FROM table WHERE CHANGE_ID IN :ChangeIds)))

Note that this 2nd query contains the 1st time-based query to overlay the current data with the proposed changed data.

The change ID column refers to the primary key of a change table which also contains a parent ID column (nullable) providing the change-on-change functionality. Hence the 2nd query refers to change IDs not a single change ID. I am filtering multiple versions in a change-on-change scenario in the client and not using SQL so it's not seen in those queries (The client has a linked list of change IDs in memory and if more than 1 version of a row is retrieved it uses the linked list to determine which version to use).

Does anybody know of an off-the-shelf product that I could use? It is a large amount of work handling this versioning myself and introduces all manner of issues.

Upvotes: 0

Views: 865

Answers (1)

Ben
Ben

Reputation: 21

There does not appear to be any off-the-shelf database or database plugin that does what I need. So I ended up utilising Oracle features to implement a solution.

The final table structure is slightly different - "Delete Flag" turned into "Change Action" which is either Add, Remove or Modify.

A global temporary table was used to store the current connection change identifier/date-time settings and a stored procedure created to populate it after connecting. This is referred to as 'context'.

Views joining versioned tables to this temporary, connection-specific context table are created programmatically for every versioned table, including instead-of insert/update/delete triggers which perform the required data versioning.

The result is that you treat the versioned tables like normal tables (and don't use the suffix _ROOT_ID for foreign keys) for select, insert, update and delete.

Only the Change Action is returned in the views and this is the only field that distinguishes a versioned table from a normal one.

Revert (which doesn't have a SQL keyword) is achieved by a double-delete. That is, if we update a record and then want to undo that update, we issue a delete command which deletes the proposed row and the record reverts to the current version. It's the most fitting SQL keyword - the alternative is to make a specific revert stored procedure.

A virtual Change Action of None exists in the views which indicates the record is not affected by the current context.

This all works quite effectively making the concept of versioning largely transparent, the only custom action required is setting the connection after connecting to the database.

Upvotes: 1

Related Questions