RobStallion
RobStallion

Reputation: 1703

How to store table history in PostgreSQL

Scenario

We need to store record history in PostgreSQL such that when a record is inserted into or updated in a primary table (e.g: pets), it is automatically backed up to a history table (pets_history).

Ideally we need the history table to be generated based on the schema of the primary table without any human intervention.

INSERT INTO pets(name, species) VALUES ('Meowth', 'Cat')
pets:
+---+------------+-------------+
|id | name       | species     |
+---+------------+-------------+
| 1 | Meowth     | Cat         |
+---+------------+-------------+

A Trigger should automatically insert the record into pets_history:

pets_history:
+----+--------+-----------+---------+
| id | ref_id | name      | species |
+----+--------+-----------+---------+
| 1  | 1      | Meowth    | Cat     |
+----+--------+-----------+---------+

When an update is made to pets to change the name of my cat from Meowth to Persian. e.g:

UPDATE pets SET name = 'Persian' WHERE id = 1;
pets:
+---+------------+-------------+
|id | name       | species     |
+---+------------+-------------+
| 1 | Persian    | Cat         |
+---+------------+-------------+

I would like to end up with the following...

pets_history:
+----+--------+-----------+---------+
| id | ref_id | name      | species |
+----+--------+-----------+---------+
| 1  | 1      | Meowth    | Cat     |
| 2  | 1      | Persian   | Cat     |
+----+--------+-----------+---------+

Later on when another column/field is added to the pets table, e.g: color

pets:
+---+--------+---------+-------+
|id | name   | species | color |
+---+--------+---------+-------+
| 1 | Meowth | Cat     | cream |
+---+--------+---------+-------+

we want this to be reflected in the pets_history table automatically:

pets_history:
+----+--------+---------+---------+-------+
| id | ref_id | name    | species | color |
+----+--------+---------+---------+-------+
| 1  | 1      | Meowth  | Cat     | null  |
| 2  | 1      | Persian | Cat     | null  |
| 3  | 1      | Persian | Cat     | cream |
+----+--------+---------+---------+-------+

If anyone knows any way of doing this natively in PostgreSQL or otherwise then please share. We had a look at this question/answer Implementing history of PostgreSQL table which partially solves challenge, but it does not auto-create the _history table.

Upvotes: 11

Views: 13513

Answers (2)

Chris Johnson
Chris Johnson

Reputation: 21956

Depending on your choice of DBMS, you may be able to automate this.

Postgres has two constructs to help you: (1) event triggers -- that fire on DB-level events like creating and altering tables; and (2) trigger functions -- that can create triggers and apply them to tables.

At a high level, you could do something like this:

  1. Create a trigger function CreateHistoryTrigger that implements the logic to trigger on inserts and updates on a base table and insert into a corresponding history table. You can generalize this by creating parameters for the base table name, column names, and other options. When called, this function will devise a trigger dynamically and apply it to a named table.

  2. Create an event trigger ManageHistoryTables that watches for changes you're interested in, i.e. assuming you don't want history tables for every last table, you'd filter DB events by schema, naming convention, a specific list of tables you want to watch or whatever's appropriate. Probably you want to watch for at least CREATE TABLE, ALTER TABLE, and DROP TABLE.

  3. For CREATE TABLE or ALTER TABLE, check system tables to see if the history table corresponding to the base table exists. If it doesn't, create it by building the create or alter table command from column attributes of the base table discovered by querying the system tables, and running that.

  4. For ALTER TABLE, if the history table already exists, diff its current column definitions with the base table, then add / modify / or drop columns by building the alter table command from column attributes of the base table discovered by querying the system tables, and running that.

  5. For DROP TABLE, drop the history table (if that's what you want to do).

At this point, you have the right history table structure, but no trigger to populates that table structure.

  1. In your event trigger, for any of CREATE TABLE, ALTER TABLE, or DROP TABLE, call the CreateHistoryTrigger trigger function passing in the table name and column names you care about. That function needs to CREATE OR REPLACE the trigger on the table, since there could be a different rendition of this trigger from a prior version of the same table's structure.

Metaprogramming like this can be challenging but it's also a fun and "permanent" / reusable way to solve problems. If you can generalize the logic that you'd use in a manually-written trigger -- allow the table names and columns to be parameters -- this is probably not all that hard, and you'd be able to reuse the same event trigger and trigger function in your future projects. Nice thing to have in your toolkit.

Upvotes: 1

cxↄ
cxↄ

Reputation: 1348

You can use to_jsonb to keep whole row as JSON object in history table. In this case, you don't need to care about adding new columns in history table, because key of value will be column name.

Pets table

CREATE TABLE public.pets
(
  id serial NOT NULL,
  name text,
  species text,
  PRIMARY KEY (id)
);

Pets history table

CREATE TABLE public.h_pets
(
  id serial NOT NULL,
  target_row_id integer NOT NULL,
  executed_operation integer NOT NULL,
  operation_executed_at timestamp without time zone NOT NULL DEFAULT now(),
  data_after_executed_operation jsonb,
  PRIMARY KEY (id)
);

Function for adding row to history table

CREATE OR REPLACE FUNCTION public.on_content_change()
  RETURNS trigger
  LANGUAGE 'plpgsql'
AS $BODY$
  DECLARE
    target_history_table TEXT;
  BEGIN
    target_history_table := TG_ARGV[0];

    IF TG_OP = 'INSERT'
    THEN
      EXECUTE
        format(
          'INSERT INTO %I (target_row_id, executed_operation, data_after_executed_operation) VALUES ($1.id, 0, to_jsonb($1))',
          target_history_table
        )
        USING NEW;
      RETURN NEW;
    ELSIF TG_OP = 'UPDATE'
    THEN
      EXECUTE
        format(
          'INSERT INTO %I (target_row_id, executed_operation, data_after_executed_operation) VALUES ($1.id, 1, to_jsonb($1))',
          target_history_table
        )
        USING NEW;
      RETURN NEW;
    ELSIF TG_OP = 'DELETE'
    THEN
      EXECUTE
        format(
          'INSERT INTO %I (target_row_id, executed_operation) VALUES ($1.id, 2)',
          target_history_table
        )
        USING OLD;
      RETURN OLD;
    END IF;
  END;
$BODY$;

And trigger for pets table

CREATE TRIGGER pets_history_trigger
  BEFORE INSERT OR DELETE OR UPDATE
  ON public.pets
  FOR EACH ROW
  EXECUTE PROCEDURE public.on_content_change('h_pets');

Upvotes: 5

Related Questions