Reputation: 40860
I want to add revisioning for records in an existing application which stores data in a PostgreSQL database. I read about strategies e.g. in this question, this question and this blog post.
I think the approach of creating a second history table, which will rarely be queried, will work best. However, I have some practical problems. Let's say this is the table I want to add revision control to:
create table people(
id serial not null primary key,
name varchar(255) not null
);
For this very simple table my history table could look like this:
create table people_history(
peopleId int not null references people(id) on delete cascade on update restrict,
revision int not null,
revisionTimestamp timestamptz not null default current_timestamp,
name character varying(255) not null,
primary key(peopleId, revision)
);
This brings up the first problem: how to generate the revision numbers.
Of course I could create a sequence and request revision numbers from it, which would be easy. However, that would leave large gaps between the revisions of each person, because all people share the same sequence. It would feel more natural if the revision numbers per person were ascending without gaps.
So I am tempted to find my revision number with select max(revision)+1 from ... where peopleId=... . However, that could lead to a race condition if two threads ask for the next revision number and then both try to insert. I have to admit that this is very unlikely (especially in my case, where only few updates happen anyway), and it would not corrupt data, because the duplicate primary key would cause a transaction rollback. But it is not pretty either. I wonder if there is a prettier solution.
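To illustrate the max(revision)+1 idea without the race, here is a minimal sketch, assuming the default READ COMMITTED isolation level. It locks the person's row in people first, so concurrent writers for the same person queue up instead of computing the same number. The id 42 and the name are placeholder values, not part of the schema above.

```sql
BEGIN;

-- Lock the parent row; a second transaction running the same statement for
-- person 42 waits here until this transaction commits or rolls back.
SELECT id FROM people WHERE id = 42 FOR UPDATE;

-- coalesce() covers the very first revision, when no history row exists yet.
INSERT INTO people_history (peopleId, revision, name)
SELECT 42, coalesce(max(revision), 0) + 1, 'New Name'
FROM people_history
WHERE peopleId = 42
RETURNING revision;  -- hands the new revision number back to the caller

COMMIT;
```

If the same transaction has already updated the people row, that UPDATE holds the row lock, so the explicit FOR UPDATE would be redundant.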
The second problem is how to fill the history table. Two ways come to mind: manually on every statement that updates the main table, or using a trigger. A trigger sounds less error-prone, as it is less likely that I forget about a query somewhere. However, I cannot communicate to the application exactly which revision number was just created, can I? So suppose I want to create a couple of event tables like this:
create table peopleUserEditEvent (
peopleId int not null,
revision int not null,
-- note: "on delete set null" cannot succeed while userId is declared "not null"
userId int not null references users(id) on delete set null on update restrict,
comment text not null default '',
primary key(peopleId, revision),
foreign key (peopleId, revision) references people_history
);
This table lists metadata for a revision that explains why the revision was created. In this case, a user with a specific ID edited the data and might have supplied a comment.
In another case (with another event table), a cron job might have changed something and documents the event, which probably has no userId and no comment but other metadata.
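Such a second event table might look like the following sketch. The table name peopleCronEditEvent and the column jobName are made up for illustration only.

```sql
-- Hypothetical event table for revisions created by a scheduled job.
create table peopleCronEditEvent (
    peopleId int not null,
    revision int not null,
    jobName text not null,  -- assumed metadata: which job made the change
    primary key (peopleId, revision),
    foreign key (peopleId, revision) references people_history
);
```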
To add this event data I need the revision number, and if the revision number was created by a trigger, it will be difficult to find out what it is (or is there a practical way to do so?).
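For illustration, a trigger-based variant could be sketched as follows, assuming the default READ COMMITTED isolation level and assuming history rows are written only by this trigger. The function and trigger names, the ids 42 and 7, and the comment text are hypothetical. The application reads the revision back inside the same transaction: the UPDATE still holds the lock on the person's row, so no other transaction can add a revision for that person in between.

```sql
-- Hypothetical trigger function that writes one history row per change.
CREATE FUNCTION people_write_history() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- Writers for the same person are serialized by the row lock on people,
    -- so max(revision) + 1 does not collide here.
    INSERT INTO people_history (peopleId, revision, name)
    SELECT NEW.id, coalesce(max(revision), 0) + 1, NEW.name
    FROM people_history
    WHERE peopleId = NEW.id;
    RETURN NULL;  -- the return value of an AFTER row trigger is ignored
END;
$$;

CREATE TRIGGER people_history_trg
AFTER INSERT OR UPDATE ON people
FOR EACH ROW EXECUTE PROCEDURE people_write_history();

-- Application side: update, then attach the event to the newest revision.
BEGIN;
UPDATE people SET name = 'New Name' WHERE id = 42;

INSERT INTO peopleUserEditEvent (peopleId, revision, userId, comment)
SELECT 42, max(revision), 7, 'fixed a typo'
FROM people_history
WHERE peopleId = 42;
COMMIT;
```

This keeps the trigger as the only place that writes history rows, while the event tables are still filled by the application, which knows the reason for the change.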
Upvotes: 3
Views: 4823
Reputation: 462
Well, you need one revisioning strategy for all the tables and columns you have. You can create a single table that records all changes and insert into it whenever you run an UPDATE, INSERT or DELETE statement. Maybe this example, the change log table of the iDempiere framework, can help you:
CREATE TABLE adempiere.ad_changelog (
ad_changelog_id NUMERIC(10,0) NOT NULL,
ad_session_id NUMERIC(10,0) NOT NULL,
ad_table_id NUMERIC(10,0) NOT NULL,
ad_column_id NUMERIC(10,0) NOT NULL,
isactive CHAR(1) DEFAULT 'Y'::bpchar NOT NULL,
created TIMESTAMP WITHOUT TIME ZONE DEFAULT now() NOT NULL,
createdby NUMERIC(10,0) NOT NULL,
updated TIMESTAMP WITHOUT TIME ZONE DEFAULT now() NOT NULL,
updatedby NUMERIC(10,0) NOT NULL,
record_id NUMERIC(10,0) NOT NULL,
oldvalue VARCHAR(2000),
newvalue VARCHAR(2000),
undo CHAR(1),
redo CHAR(1),
iscustomization CHAR(1) DEFAULT 'N'::bpchar NOT NULL,
description VARCHAR(255),
ad_changelog_uu VARCHAR(36) DEFAULT NULL::character varying,
-- MATCH SIMPLE is the PostgreSQL default; MATCH PARTIAL is not implemented there
CONSTRAINT adcolumn_adchangelog FOREIGN KEY (ad_column_id)
REFERENCES adempiere.ad_column(ad_column_id)
MATCH SIMPLE
ON DELETE CASCADE
ON UPDATE NO ACTION
DEFERRABLE
INITIALLY DEFERRED,
CONSTRAINT adsession_adchangelog FOREIGN KEY (ad_session_id)
REFERENCES adempiere.ad_session(ad_session_id)
MATCH SIMPLE
ON DELETE NO ACTION
ON UPDATE NO ACTION
DEFERRABLE
INITIALLY DEFERRED,
CONSTRAINT adtable_adchangelog FOREIGN KEY (ad_table_id)
REFERENCES adempiere.ad_table(ad_table_id)
MATCH SIMPLE
ON DELETE CASCADE
ON UPDATE NO ACTION
DEFERRABLE
INITIALLY DEFERRED
)
WITH (oids = false);
CREATE INDEX ad_changelog_speed ON adempiere.ad_changelog
USING btree (ad_table_id, record_id);
CREATE UNIQUE INDEX ad_changelog_uu_idx ON adempiere.ad_changelog
USING btree (ad_changelog_uu COLLATE pg_catalog."default");
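A table of this kind could be filled by a generic trigger. The following is only a rough sketch with a much simplified, made-up change_log table, not iDempiere's actual mechanism. It assumes PostgreSQL 9.5 or later (for to_jsonb) and that every logged table has an integer id column.

```sql
-- Simplified, hypothetical change log: one row per changed column.
CREATE TABLE change_log (
    change_log_id bigserial PRIMARY KEY,
    table_name    text        NOT NULL,
    record_id     integer     NOT NULL,
    column_name   text        NOT NULL,
    old_value     text,
    new_value     text,
    changed_at    timestamptz NOT NULL DEFAULT current_timestamp
);

CREATE FUNCTION log_column_changes() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- Turn OLD and NEW into key/value pairs and keep the columns that differ.
    INSERT INTO change_log (table_name, record_id, column_name, old_value, new_value)
    SELECT TG_TABLE_NAME, NEW.id, n.key, o.value, n.value
    FROM jsonb_each_text(to_jsonb(NEW)) AS n
    JOIN jsonb_each_text(to_jsonb(OLD)) AS o USING (key)
    WHERE n.value IS DISTINCT FROM o.value;
    RETURN NULL;  -- the return value of an AFTER row trigger is ignored
END;
$$;

-- Attach the same function to every table that should be logged.
CREATE TRIGGER people_change_log
AFTER UPDATE ON people
FOR EACH ROW EXECUTE PROCEDURE log_column_changes();
```

The sketch covers UPDATE only; INSERT and DELETE would need their own handling, because only one of OLD and NEW is available there.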
Upvotes: 0