jjames

Reputation: 599

Execute deferred trigger only once per row in PostgreSQL

I have a deferred AFTER UPDATE trigger on a table, set to fire when a certain column is updated. It's an integer type I'm using as a counter.

I'm not 100% certain but it looks like if I increment that particular column 100 times during a transaction, the trigger is queued up and executed 100 times at the end of the transaction.

I would like the trigger to only be scheduled once per row no matter how many times I've incremented that column.

Can I do that somehow? Alternatively, if fired triggers must be queued regardless of whether they are duplicates, can I clear this queue during the first run of the trigger?

Version of Postgres is 9.1. Here's what I got:

CREATE CONSTRAINT TRIGGER counter_change
    AFTER UPDATE OF "Counter" ON "table"
    DEFERRABLE INITIALLY DEFERRED
    FOR EACH ROW
    EXECUTE PROCEDURE counter_change();

CREATE OR REPLACE FUNCTION counter_change()
    RETURNS trigger
    LANGUAGE plpgsql
    AS $$
DECLARE
BEGIN

PERFORM some_expensive_procedure(NEW."id");

RETURN NEW;

END;$$;
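
To check the suspected behaviour, a small reproduction along these lines could be used. This is only a sketch: the demo_counter table, the demo_counter_notice() function and the trigger name are made up for the test and are not part of my real schema.

```sql
-- Hypothetical test table, not the real one.
CREATE TABLE demo_counter (
    id      integer PRIMARY KEY,
    counter integer NOT NULL DEFAULT 0
);

-- Stand-in for the expensive procedure: it only reports that it ran.
CREATE FUNCTION demo_counter_notice()
    RETURNS trigger
    LANGUAGE plpgsql
    AS $$
BEGIN
    RAISE NOTICE 'deferred trigger fired for id %', NEW.id;
    RETURN NULL;  -- return value of an AFTER trigger is ignored
END;$$;

CREATE CONSTRAINT TRIGGER demo_counter_change
    AFTER UPDATE OF counter ON demo_counter
    DEFERRABLE INITIALLY DEFERRED
    FOR EACH ROW
    EXECUTE PROCEDURE demo_counter_notice();

INSERT INTO demo_counter (id) VALUES (1);

BEGIN;
UPDATE demo_counter SET counter = counter + 1 WHERE id = 1;
UPDATE demo_counter SET counter = counter + 1 WHERE id = 1;
UPDATE demo_counter SET counter = counter + 1 WHERE id = 1;
COMMIT;  -- the NOTICEs appear here, one per queued trigger event
```

If the NOTICE shows up once per UPDATE at COMMIT, the trigger really is queued once per update rather than once per row.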

Upvotes: 15

Views: 13771

Answers (4)

Matt Whitlock

Reputation: 897

pilcrow's answer is good, but what if you want to avoid the overhead of executing a PL/pgSQL function FOR EACH ROW that is touched? You can't have a CONSTRAINT trigger that is also a FOR EACH STATEMENT trigger. A solution is to push the deferred constraint trigger down by one level…

CREATE FUNCTION defer_once_trigger()
    RETURNS trigger
    LANGUAGE plpgsql
AS $$
BEGIN
    BEGIN
        CREATE TEMPORARY TABLE deferred_once_trigger (
                "id" integer NOT NULL PRIMARY KEY
            )
            ON COMMIT DROP;
        CREATE CONSTRAINT TRIGGER deferred_once_trigger
            AFTER INSERT ON pg_temp.deferred_once_trigger
            DEFERRABLE INITIALLY DEFERRED
            FOR EACH ROW
            EXECUTE FUNCTION deferred_once_trigger();
    EXCEPTION
        WHEN duplicate_table THEN
            NULL;
    END;
    CASE TG_OP
        WHEN 'INSERT' THEN
            INSERT INTO pg_temp.deferred_once_trigger
                SELECT DISTINCT "id"
                    FROM new
                ON CONFLICT ("id") DO NOTHING;
        WHEN 'UPDATE' THEN
            INSERT INTO pg_temp.deferred_once_trigger
                SELECT "id"
                    FROM old
                UNION
                SELECT "id"
                    FROM new
                ON CONFLICT ("id") DO NOTHING;
        WHEN 'DELETE' THEN
            INSERT INTO pg_temp.deferred_once_trigger
                SELECT DISTINCT "id"
                    FROM old
                ON CONFLICT ("id") DO NOTHING;
    END CASE;
    RETURN NULL;
END;
$$;

CREATE TRIGGER defer_once_trigger_insert
    AFTER INSERT ON my_table
    REFERENCING NEW TABLE AS new
    FOR EACH STATEMENT
    EXECUTE FUNCTION defer_once_trigger();

CREATE TRIGGER defer_once_trigger_update
    AFTER UPDATE ON my_table
    REFERENCING OLD TABLE AS old
        NEW TABLE AS new
    FOR EACH STATEMENT
    EXECUTE FUNCTION defer_once_trigger();

CREATE TRIGGER defer_once_trigger_delete
    AFTER DELETE ON my_table
    REFERENCING OLD TABLE AS old
    FOR EACH STATEMENT
    EXECUTE FUNCTION defer_once_trigger();

The defer_once_trigger() function is called only once per DML statement affecting my_table rather than once per affected row. That could translate to sizable performance gains if you have statements that affect many rows multiple times in the same transaction.

As in pilcrow's answer, the defer_once_trigger() function creates a temporary table to track the affected rows by primary key. If the table creation succeeds, then it also adds a deferrable constraint trigger to the temporary table. In any case, the function then inserts the IDs of all the affected rows into the temporary table, skipping the ones that are already present. The server automatically schedules deferred calls to the deferred_once_trigger() function for each distinct ID that is inserted into the temporary table.
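
The deferred_once_trigger() function itself is not shown above. A minimal sketch, assuming it only needs to pass each queued ID on to the some_expensive_procedure() from the question, might look like this. It has to exist before the first statement on my_table fires defer_once_trigger(), because that is when the constraint trigger referring to it is created.

```sql
-- Sketch of the deferred payload function (an assumption, not shown in the
-- original answer). It runs once per distinct ID at the end of the transaction.
CREATE FUNCTION deferred_once_trigger()
    RETURNS trigger
    LANGUAGE plpgsql
AS $$
BEGIN
    -- NEW is the row that was inserted into pg_temp.deferred_once_trigger.
    PERFORM some_expensive_procedure(NEW."id");
    RETURN NULL;  -- return value of an AFTER trigger is ignored
END;
$$;
```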

Note, because the temporary table is created in a connection-local temporary schema, there will never be any collisions with other concurrent transactions since each connection can have at most one open transaction at any given time. (The pg_temp schema name that is used to qualify the temporary table name is actually an alias that dynamically resolves to the server-assigned unique temporary schema name for the current connection.)
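
To see how the pg_temp alias resolves, something like the following can be run in any session. The demo_scratch table is a throwaway name used only for this illustration.

```sql
BEGIN;

-- Hypothetical scratch table, dropped again at COMMIT.
CREATE TEMPORARY TABLE demo_scratch (
        "id" integer NOT NULL PRIMARY KEY
    )
    ON COMMIT DROP;

-- Real name of this session's temporary schema (of the form pg_temp_N).
SELECT nspname
    FROM pg_namespace
    WHERE oid = pg_my_temp_schema();

-- The pg_temp alias reaches the same table without knowing that name.
SELECT count(*) FROM pg_temp.demo_scratch;

COMMIT;
```

A second session running the same statements concurrently gets its own temporary schema and its own demo_scratch table.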

Although you can't use UPDATE OF … on a trigger that requests transition relations, you can perform equivalent filtering in the WHEN 'UPDATE' branch of CASE TG_OP. For example:

WHEN 'UPDATE' THEN
    INSERT INTO pg_temp.deferred_once_trigger
        SELECT DISTINCT "id"
            FROM old
                FULL JOIN new USING ("id")
            WHERE old.counter IS DISTINCT FROM new.counter
        ON CONFLICT ("id") DO NOTHING;

Upvotes: 0

alecov

Reputation: 5171

This cannot be done directly; you need a trick.

For example, consider a balances(account_id, balance) table in which no balance may be negative at the end of a transaction, although a balance may go negative during the transaction, e.g. because of partial updates to the table.

An ordinary CHECK (balance >= 0) constraint cannot be deferred, so it will not work. A deferred constraint trigger that checks new.balance >= 0 will not work either, because the value of new is fixed when the trigger is queued, not when it is executed.

Hence, a potential solution is to actually query the table in the trigger function:

create function check_balance_trigger()
returns trigger language plpgsql as $$
begin
    -- This queries the table at the time the trigger is executed:
    select * from balances into new where account_id = new.account_id;
    if new.balance < 0 then
        raise 'Balance cannot be negative: %, %', new.account_id, new.balance;
    end if;
    return new;
end $$;

create constraint trigger check_balance
after insert or update on balances deferrable initially deferred
for each row execute function check_balance_trigger();
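
A short usage sketch, assuming balances was created beforehand with a definition like the one in the comment (the column types are my assumption):

```sql
-- Assumed table definition, created before the trigger above:
--   create table balances (account_id integer primary key, balance numeric not null);

insert into balances (account_id, balance) values (1, 100), (2, 0);

-- Goes negative in the middle, but ends at zero: the commit succeeds,
-- because each queued trigger call re-reads the final row state.
begin;
update balances set balance = balance - 150 where account_id = 1;
update balances set balance = balance + 50  where account_id = 1;
commit;

-- Still negative at the end of the transaction: the commit raises
-- 'Balance cannot be negative' and the transaction is rolled back.
begin;
update balances set balance = balance - 10 where account_id = 2;
commit;
```

Note that the trigger function is still called once per update here; only the value it checks is the final one.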

Upvotes: 1

Erwin Brandstetter

Reputation: 658082

This is a tricky problem. But it can be done with per-column triggers and conditional trigger execution introduced in PostgreSQL 9.0.

You need an "updated" flag per row for this solution. Use a boolean column in the same table for simplicity. But it could be in another table or even a temporary table per transaction.

The expensive payload is executed once per row where the counter is updated (once or multiple times).

This should also perform well, because ...

  • ... it avoids multiple calls of triggers at the root (scales well)
  • ... does not change additional rows (minimize table bloat)
  • ... does not need expensive exception handling.

Consider the following

Demo

Tested in PostgreSQL 9.1 with a separate schema x as test environment.

Tables and dummy rows

-- DROP SCHEMA x;
CREATE SCHEMA x;

CREATE TABLE x.tbl (
 id int
,counter int
,trig_exec_count integer  -- for monitoring payload execution.
,updated bool);

Insert two rows to demonstrate it works with multiple rows:

INSERT INTO x.tbl VALUES
 (1, 0, 0, NULL)
,(2, 0, 0, NULL);

Trigger functions and Triggers

1.) Execute expensive payload

CREATE OR REPLACE FUNCTION x.trg_upaft_counter_change_1()
    RETURNS trigger AS
$BODY$
BEGIN

 -- PERFORM some_expensive_procedure(NEW.id);
 -- Update trig_exec_count to count execution of expensive payload.
 -- Could be in another table, for simplicity, I use the same:

UPDATE x.tbl t
SET    trig_exec_count = trig_exec_count + 1
WHERE  t.id = NEW.id;

RETURN NULL;  -- RETURN value of AFTER trigger is ignored anyway

END;
$BODY$ LANGUAGE plpgsql;

2.) Flag row as updated.

CREATE OR REPLACE FUNCTION x.trg_upaft_counter_change_2()
    RETURNS trigger AS
$BODY$
BEGIN

UPDATE x.tbl
SET    updated = TRUE
WHERE  id = NEW.id;
RETURN NULL;

END;
$BODY$ LANGUAGE plpgsql;

3.) Reset "updated" flag.

CREATE OR REPLACE FUNCTION x.trg_upaft_counter_change_3()
    RETURNS trigger AS
$BODY$
BEGIN

UPDATE x.tbl
SET    updated = NULL
WHERE  id = NEW.id;
RETURN NULL;

END;
$BODY$ LANGUAGE plpgsql;

Trigger names are relevant! Triggers fired by the same event are executed in alphabetical order by name.

1.) Payload, only if not "updated" yet:

CREATE CONSTRAINT TRIGGER upaft_counter_change_1
    AFTER UPDATE OF counter ON x.tbl
    DEFERRABLE INITIALLY DEFERRED
    FOR EACH ROW
    WHEN (NEW.updated IS NULL)
    EXECUTE PROCEDURE x.trg_upaft_counter_change_1();

2.) Flag row as updated, only if not "updated" yet:

CREATE TRIGGER upaft_counter_change_2   -- not deferred!
    AFTER UPDATE OF counter ON x.tbl
    FOR EACH ROW
    WHEN (NEW.updated IS NULL)
    EXECUTE PROCEDURE x.trg_upaft_counter_change_2();

3.) Reset Flag. No endless loop because of trigger condition.

CREATE CONSTRAINT TRIGGER upaft_counter_change_3
    AFTER UPDATE OF updated ON x.tbl
    DEFERRABLE INITIALLY DEFERRED
    FOR EACH ROW
    WHEN (NEW.updated)
    EXECUTE PROCEDURE x.trg_upaft_counter_change_3();
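
Since the firing order depends on the names, it can help to list the triggers on the table sorted by name. A small sketch using the system catalog:

```sql
-- List user-defined triggers on x.tbl sorted by name,
-- which is the order used for triggers fired by the same event.
SELECT tgname
FROM   pg_trigger
WHERE  tgrelid = 'x.tbl'::regclass
AND    NOT tgisinternal   -- skip triggers created internally by the system
ORDER  BY tgname;
```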

Test

Run UPDATE & SELECT separately to see the deferred effect. If executed together (in one transaction), the SELECT will show the new tbl.counter but the old tbl.trig_exec_count.

UPDATE x.tbl SET counter = counter + 1;

SELECT * FROM x.tbl;

Now, update the counter multiple times (in one transaction). The payload will only be executed once. Voilà!

BEGIN;
UPDATE x.tbl SET counter = counter + 1;
UPDATE x.tbl SET counter = counter + 1;
UPDATE x.tbl SET counter = counter + 1;
UPDATE x.tbl SET counter = counter + 1;
UPDATE x.tbl SET counter = counter + 1;
COMMIT;

SELECT * FROM x.tbl;

Upvotes: 18

pilcrow

Reputation: 58681

I don't know of a way to collapse trigger execution to once per (updated) row per transaction, but you can emulate this with a TEMPORARY ON COMMIT DROP table which tracks those modified rows and performs your expensive operation only once per row per tx:

CREATE OR REPLACE FUNCTION counter_change() RETURNS TRIGGER
AS $$
BEGIN
  -- If we're the first invocation of this trigger in this tx,
  -- make our scratch table.  Create unique index separately to
  -- avoid NOTICEs without fiddling with log_min_messages
  BEGIN
    CREATE LOCAL TEMPORARY TABLE tbl_counter_tx_once
      ("id" AS_APPROPRIATE NOT NULL)
      ON COMMIT DROP;
    CREATE UNIQUE INDEX ON tbl_counter_tx_once ("id");
  EXCEPTION WHEN duplicate_table THEN
    NULL;
  END;

  -- If we're the first invocation in this tx *for this row*,
  -- then do our expensive operation.
  BEGIN
    INSERT INTO tbl_counter_tx_once ("id") VALUES (NEW."id");
    PERFORM SOME_EXPENSIVE_OPERATION_HERE(NEW."id");
  EXCEPTION WHEN unique_violation THEN
    NULL;
  END;

  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

There's of course a risk of name collision with that temporary table, so choose judiciously.

Upvotes: 9
