Reputation: 139
I have a lot of tables in a complex schema, all generated pair-wise like this:
CREATE TABLE first_id (
nid BIGSERIAL PRIMARY KEY
);
CREATE TABLE first (
nid BIGINT REFERENCES first_id(nid),
revisions TSRANGE NOT NULL,
column1 TEXT
);
CREATE TABLE second_id (
nid BIGSERIAL PRIMARY KEY
);
CREATE TABLE second (
nid BIGINT REFERENCES second_id(nid),
revisions TSRANGE NOT NULL,
column2 BIGINT REFERENCES first_id(nid),
column3 TEXT
);
Different tables have different columns apart from nid and revisions, which they all have.
The reason for this design is that we keep different revisions of the same row (via non-overlapping TSRANGE), but want references between different types to point at the abstract IDs, not at specific revisions (we resolve revisions at query time).
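For illustration, resolving a reference to its currently valid revision at query time might look something like this (a sketch; the containment test against NOW() is just our assumption of a typical lookup):
-- Fetch the revision of a first row that is valid right now,
-- relying on the revision ranges for a given nid never overlapping
SELECT f.nid, f.column1
FROM first f
WHERE f.nid = 42                      -- the abstract ID being referenced
AND f.revisions @> NOW()::TIMESTAMP;  -- the revision covering the present moment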
When we want to create a new first row, we first run
INSERT INTO first_id DEFAULT VALUES RETURNING nid
then we fetch this ID into the calling code, and emit it again in a new query:
INSERT INTO first (nid, revisions, column1)
VALUES ([nid from previous query], TSRANGE(NOW()::TIMESTAMP, 'INFINITY'), 'A new string');
This is all fine when we create new ids one by one. But we now want to bulk-create ids, thousands at a time, from another query. Ordinarily we would simply do something like
INSERT INTO first (nid, revisions, column1)
SELECT ???, TSRANGE(NOW()::TIMESTAMP, 'INFINITY'), somecolumn
FROM sometable
WHERE someexpression
This, of course, won't work - if we're inserting 234 rows, we need to do 234 inserts into first_id. We could use a query with multiple CTEs that insert ids and then correlate on row_number or similar (sketched below), but it becomes bulky and difficult to understand.
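For reference, the CTE variant we would rather avoid could be sketched roughly like this (pg_get_serial_sequence and the row_number pairing are our assumptions; the pairing order is arbitrary, which is fine here because every generated id is fresh and interchangeable):
WITH src AS (
    SELECT somecolumn, ROW_NUMBER() OVER () AS rn
    FROM sometable
    WHERE someexpression
), new_ids AS (
    -- one fresh id per source row
    INSERT INTO first_id (nid)
    SELECT nextval(pg_get_serial_sequence('first_id', 'nid'))
    FROM src
    RETURNING nid
), numbered AS (
    SELECT nid, ROW_NUMBER() OVER () AS rn
    FROM new_ids
)
INSERT INTO first (nid, revisions, column1)
SELECT n.nid, TSRANGE(NOW()::TIMESTAMP, 'INFINITY'), s.somecolumn
FROM src s
JOIN numbered n USING (rn);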
Instead we want a RULE or TRIGGER or similar functionality that works as follows:
When inserting into first:
For each new row:
If <nid> is NULL:
INSERT INTO first_id, keeping the new <nid>
Replace the new row's <nid> with the value from the previous step
Else:
Keep the new row as is
Insert the new row into first
We want this to work efficiently even if we run an INSERT ... SELECT ... that generates tens of thousands of new ids, and of course to work even when multiple transactions are doing it at the same time. Unfortunately I am not yet strong enough with the PostgreSQL force to know what is best, nor how it would work. Thankful as always for any and all input.
Upvotes: 2
Views: 1363
Reputation: 7310
You can create a trigger on your tables, like this:
CREATE OR REPLACE FUNCTION f_t_first()
RETURNS trigger AS
$BODY$
BEGIN
    -- If no id was supplied, generate one in first_id
    -- and carry it over into the row being inserted
    IF NEW.nid IS NULL THEN
        INSERT INTO first_id DEFAULT VALUES RETURNING nid INTO NEW.nid;
    END IF;
    RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
CREATE TRIGGER t_first
BEFORE INSERT
ON first
FOR EACH ROW
EXECUTE PROCEDURE f_t_first();
And then just insert into the first table without worrying about the nid calculation or the first_id population. The trigger takes care of both:
INSERT INTO first (revisions, column1)
SELECT TSRANGE(NOW()::TIMESTAMP, 'INFINITY'), somecolumn
FROM sometable
WHERE someexpression
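Rows that arrive with an explicit nid are left untouched, since the trigger only fills in NULL values. A quick sanity check could look like this (the literal values are just examples; 42 must already exist in first_id because of the foreign key):
-- nid omitted: the trigger inserts into first_id and fills it in
INSERT INTO first (revisions, column1)
VALUES (TSRANGE(NOW()::TIMESTAMP, 'INFINITY'), 'auto id');
-- nid supplied (e.g. a new revision of an existing row): kept as is
INSERT INTO first (nid, revisions, column1)
VALUES (42, TSRANGE(NOW()::TIMESTAMP, 'INFINITY'), 'explicit id');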
Upvotes: 3