Tommy
Tommy

Reputation: 245

deadlock in postgresql trigger function

Using postgres 9.3, I have a table called regression_runs that stores some counters. When a row in this table is updated, inserted, or deleted, a trigger function is called to update a row in the nightly_runs table to keep a running total of those counters for all regression_runs with the given ID. The approach I've taken is fairly widely documented. My problem, however is that I'm running into deadlocks when multiple processes are trying to simultaneously insert new rows in the regression_runs table with the same nightly_run_id.

The regression_runs table looks like this:

regression=> \d regression_runs
                                      Table "public.regression_runs"
     Column      |           Type           |                          Modifiers                           
-----------------+--------------------------+--------------------------------------------------------------
 id              | integer                  | not null default nextval('regression_runs_id_seq'::regclass)
 username        | character varying(16)    | not null
 nightly_run_id  | integer                  | 
 nightly_run_pid | integer                  | 
 passes          | integer                  | not null default 0
 failures        | integer                  | not null default 0
 errors          | integer                  | not null default 0
 skips           | integer                  | not null default 0
Indexes:
    "regression_runs_pkey" PRIMARY KEY, btree (id)
    "regression_runs_nightly_run_id_idx" btree (nightly_run_id)
Foreign-key constraints:
    "regression_runs_nightly_run_id_fkey" FOREIGN KEY (nightly_run_id) REFERENCES nightly_runs(id) ON UPDATE CASCADE ON DELETE CASCADE
Triggers:
    regression_run_update_trigger AFTER INSERT OR DELETE OR UPDATE ON regression_runs FOR EACH ROW EXECUTE PROCEDURE regression_run_update()

The nightly_runs table looks like this:

regression=> \d nightly_runs
                                    Table "public.nightly_runs"
   Column   |           Type           |                         Modifiers                         
------------+--------------------------+-----------------------------------------------------------
 id         | integer                  | not null default nextval('nightly_runs_id_seq'::regclass)
 passes     | integer                  | not null default 0
 failures   | integer                  | not null default 0
 errors     | integer                  | not null default 0
 skips      | integer                  | not null default 0
Indexes:
    "nightly_runs_pkey" PRIMARY KEY, btree (id)
Referenced by:
    TABLE "regression_runs" CONSTRAINT "regression_runs_nightly_run_id_fkey" FOREIGN KEY (nightly_run_id) REFERENCES nightly_runs(id) ON UPDATE CASCADE ON DELETE CASCADE

The trigger function regression_run_update is this:

CREATE OR REPLACE FUNCTION regression_run_update() RETURNS "trigger"
    AS $$
        BEGIN
        IF TG_OP = 'UPDATE' THEN
                IF (NEW.nightly_run_id IS NOT NULL) and (NEW.nightly_run_id = OLD.nightly_run_id) THEN
                        UPDATE nightly_runs SET passes = passes + (NEW.passes - OLD.passes), failures = failures + (NEW.failures - OLD.failures), errors = errors + (NEW.errors - OLD.errors), skips = skips + (NEW.skips - OLD.skips) WHERE id = NEW.nightly_run_id;
                ELSE
                        IF NEW.nightly_run_id IS NOT NULL THEN
                                UPDATE nightly_runs SET passes = passes + NEW.passes, failures = failures + NEW.failures, errors = errors + NEW.errors, skips = skips + NEW.skips WHERE id = NEW.nightly_run_id;
                        END IF;
                        IF OLD.nightly_run_id IS NOT NULL THEN
                                UPDATE nightly_runs SET passes = passes - OLD.passes, failures = failures - OLD.failures, errors = errors - OLD.errors, skips = skips - OLD.skips WHERE id = OLD.nightly_run_id;
                        END IF;
                END IF;
        ELSIF TG_OP = 'INSERT' THEN
                IF NEW.nightly_run_id IS NOT NULL THEN
                        UPDATE nightly_runs SET passes = passes + NEW.passes, failures = failures + NEW.failures, errors = errors + NEW.errors, skips = skips + NEW.skips WHERE id = NEW.nightly_run_id;
                END IF;
        ELSIF TG_OP = 'DELETE' THEN
                IF OLD.nightly_run_id IS NOT NULL THEN
                        UPDATE nightly_runs SET passes = passes - OLD.passes, failures = failures - OLD.failures, errors = errors - OLD.errors, skips = skips - OLD.skips WHERE id = OLD.nightly_run_id;
                END IF;
        END IF;
        RETURN NEW;
        END;
$$  
    LANGUAGE plpgsql;

What I see in the postgres log file is this:

ERROR:  deadlock detected
DETAIL:  Process 20266 waits for ShareLock on transaction 7520; blocked by process 20263.
        Process 20263 waits for ExclusiveLock on tuple (1,70) of relation 18469 of database 18354; blocked by process 20266.
        Process 20266: insert into regression_runs (username, nightly_run_id, nightly_run_pid) values ('tbeadle', 135, 20262);
        Process 20263: insert into regression_runs (username, nightly_run_id, nightly_run_pid) values ('tbeadle', 135, 20260);
HINT:  See server log for query details.
CONTEXT:  SQL statement "UPDATE nightly_runs SET passes = passes + NEW.passes, failures = failures + NEW.failures, errors = errors + NEW.errors, skips = skips + NEW.skips WHERE id = NEW.nightly_run_id"
        PL/pgSQL function regression_run_update() line 16 at SQL statement
STATEMENT:  insert into regression_runs (username, nightly_run_id, nightly_run_pid) values ('tbeadle', 135, 20262);

I can reproduce the problem with this script:

#!/usr/bin/env python

import os
import multiprocessing
import psycopg2

class Foo(object):
    def child(self):
        pid = os.getpid()
        conn = psycopg2.connect(
            'dbname=regression host=localhost user=regression')
        cur = conn.cursor()
        for i in xrange(100):
            cur.execute(
                "insert into regression_runs "
                "(username, nightly_run_id, nightly_run_pid) "
                "values "
                "('tbeadle', %s, %s);", (self.nid, pid))
            conn.commit()
        return

    def start(self):
        conn = psycopg2.connect(
            'dbname=regression host=localhost user=regression')
        cur = conn.cursor()
        cur.execute('insert into nightly_runs default values returning id;')
        row = cur.fetchone()
        conn.commit()
        self.nid = row[0]
        procs = []
        for child in xrange(5):
            procs.append(multiprocessing.Process(target=self.child))
        for proc in procs:
            proc.start()
        for proc in procs:
            proc.join()

Foo().start()

I can not figure out why the deadlock is happening or what I can do about it. Please help!

Upvotes: 4

Views: 3751

Answers (1)

Denis de Bernardy
Denis de Bernardy

Reputation: 78571

More often than not, deadlocks occur because updates related to OLD and NEW are not enforced in a consistent order. Case in point:

IF TG_OP = 'UPDATE' THEN
  IF (NEW.nightly_run_id IS NOT NULL) AND (NEW.nightly_run_id = OLD.nightly_run_id) THEN
    -- stuff that seems fine
  ELSE
    IF NEW.nightly_run_id IS NOT NULL THEN
      UPDATE nightly_runs ... WHERE id = NEW.nightly_run_id; -- lock
    END IF;
    IF OLD.nightly_run_id IS NOT NULL THEN
      UPDATE nightly_runs ... WHERE id = OLD.nightly_run_id; -- lock
    END IF;

Imagine two transactions:

  • T1 acquires a lock on new.nightly_run_id = 1 and awaits a lock on old.nightly_run_id = 2
  • T2 acquires a lock on new.nightly_run_id = 2 and awaits a lock on old.nightly_run_id = 1

Deadlock...

Force the order to avoid this kind of situation:

IF OLD.nightly_run_id = NEW.nightly_run_id THEN
  -- stuff that seems fine
ELSIF OLD.nightly_run_id < NEW.nightly_run_id THEN
  UPDATE nightly_runs ... WHERE id = OLD.nightly_run_id;
  UPDATE nightly_runs ... WHERE id = NEW.nightly_run_id;
ELSEIF NEW.nightly_run_id < OLD.nightly_run_id THEN
  UPDATE nightly_runs ... WHERE id = NEW.nightly_run_id;
  UPDATE nightly_runs ... WHERE id = OLD.nightly_run_id;
ELSEIF OLD.nightly_run_id IS NOT NULL THEN
  UPDATE nightly_runs ... WHERE id = OLD.nightly_run_id;
ELSEIF NEW.nightly_run_id IS NOT NULL THEN
  UPDATE nightly_runs ... WHERE id = NEW.nightly_run_id;
END IF;

The same kind of change should occur for your other triggers where applicable. Bar other pathologies in your code, the deadlocks should then go away.

Upvotes: 5

Related Questions