robert

Reputation: 516

Locking in Postgres function

Let's say I have a transactions table and a transaction_summary table. I have created the following trigger to update the transaction_summary table.

CREATE OR REPLACE FUNCTION doSomeThing() RETURNS TRIGGER AS
$BODY$
DECLARE 
rec_cnt bigint;
BEGIN
    -- lock rows which have to be updated
    SELECT count(1) FROM (SELECT 1 FROM transaction_summary WHERE receiver = new.receiver FOR UPDATE) r INTO rec_cnt;
    IF rec_cnt = 0 
    THEN 
        -- if there are no rows then create new entry in summary table
        -- lock whole table
        LOCK TABLE "transaction_summary" IN ACCESS EXCLUSIVE MODE;
        INSERT INTO transaction_summary( ... ) VALUES ( ... );
    ELSE
        UPDATE transaction_summary SET ... WHERE receiver = new.receiver;
    END IF;

    SELECT count(1) FROM (SELECT 1 FROM transaction_summary WHERE sender = new.sender FOR UPDATE) r INTO rec_cnt;
    IF rec_cnt = 0 
    THEN 
        LOCK TABLE "transaction_summary" IN ACCESS EXCLUSIVE MODE;
        INSERT INTO transaction_summary( ... ) VALUES ( ... );
    ELSE
        UPDATE transaction_summary SET ... WHERE sender = new.sender;
    END IF;   

    RETURN new;
END;
$BODY$
LANGUAGE plpgsql;

Question: Will there be a deadlock? As I understand it, a deadlock might happen like this:

 _________
|__table__| <- executor #1 waits on executor #2 to be able to lock the whole table,
|_________|    AND executor #2 waits on executor #1 to be able to lock the whole table
|_________|
|_________| <- row is locked by executor #1
|_________|
|_________| <- row is locked by executor #2

It seems that the only option is to lock the whole table at the beginning of every transaction.

Upvotes: 2

Views: 2067

Answers (1)

Alex Hayward

Reputation: 216

Are your 'SELECT 1 FROM transactions WHERE ...' queries meant to access 'transaction_summary' instead? Also, notice that those two queries can at least theoretically deadlock each other if two DB transactions insert two 'transactions' rows where the first one's new.sender equals the second one's new.receiver and the first one's new.receiver equals the second one's new.sender.
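
A sketch of that interleaving, purely illustrative: it assumes each user has a single summary row that both the receiver = ... and the sender = ... queries match, which is exactly the situation where the crossed ordering bites.

-- DB transaction #1 handles a payment A -> B, #2 handles B -> A
-- #1:
SELECT 1 FROM transaction_summary WHERE receiver = 'B' FOR UPDATE;  -- locks B's summary row
-- #2:
SELECT 1 FROM transaction_summary WHERE receiver = 'A' FOR UPDATE;  -- locks A's summary row
-- #1:
SELECT 1 FROM transaction_summary WHERE sender = 'A' FOR UPDATE;    -- waits: #2 holds A's row
-- #2:
SELECT 1 FROM transaction_summary WHERE sender = 'B' FOR UPDATE;    -- waits: #1 holds B's row
-- => deadlock; PostgreSQL aborts one of the two with "deadlock detected" (SQLSTATE 40P01)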

You can't, in general, guarantee that you won't get a deadlock from a database. Even if you try to prevent them by writing your queries carefully (e.g., ordering your updates consistently) you can still get caught out because you can't control the order of INSERT/UPDATE, or of constraint checks. In any case, comparing every transaction against every other to check for deadlocks doesn't scale as your application grows.

So, your code should always be prepared to re-run transactions when you get 'deadlock detected' errors. If you do that, and you think that conflicting transactions will be uncommon, then you might as well let your deadlock-handling code deal with it.

If you think deadlocks will be common then retrying might cause you a performance problem - although contending on a big table lock could be one, too. Here are some options:

  • If new.receiver and new.sender are, for example, the IDs of rows in a MyUsers table, you could require all code which inserts into 'transaction_summary' to first do 'SELECT 1 FROM MyUsers WHERE id IN (user1, user2) FOR UPDATE' (sketched just after this list). It'll break if someone forgets, but so will your table locking. By doing it that way you'll swap one big table lock for many separate row locks.
  • Add UNIQUE constraints to transaction_summary and look for the error when they're violated (also sketched below). You should probably add the constraints anyway, even if you handle this another way; they'll detect bugs.
  • You could allow duplicate transaction_summary rows and require users of that table to add them up. Messy, and easy for developers who don't know about it to create bugs (though you could add a view which does the adding). But if you really can't take the performance hit of locking and deadlocks, you could do it.
  • You could try the SERIALIZABLE transaction isolation level and take out the table locks. By my reading, the SELECT ... FOR UPDATE should create a predicate lock (and so should a plain SELECT). That'd stop any other transaction that does a conflicting insert from committing successfully. However, using SERIALIZABLE throughout your application will cost you performance and give you a lot more transactions to retry.
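
For the first option, a minimal sketch of what the start of the trigger body could look like; MyUsers and its id column are the hypothetical names from the bullet above, and this assumes new.sender and new.receiver really are MyUsers ids:

    -- Hypothetical: lock both users' parent rows up front, in one statement,
    -- so concurrent writers for the same pair serialize on row locks instead
    -- of taking an ACCESS EXCLUSIVE lock on the whole summary table.
    -- (Inside plpgsql use PERFORM, since a bare SELECT needs a destination.)
    PERFORM 1 FROM MyUsers WHERE id IN (new.sender, new.receiver) FOR UPDATE;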

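For the UNIQUE-constraint option, a sketch of the idea; the summary table's real columns are elided in the question, so the constraint below on receiver (and a mirror one on sender) is an assumption:

ALTER TABLE transaction_summary
    ADD CONSTRAINT transaction_summary_receiver_key UNIQUE (receiver);

-- the trigger's INSERT branch could then catch the race directly instead of
-- taking a table lock:
BEGIN
    INSERT INTO transaction_summary( ... ) VALUES ( ... );
EXCEPTION WHEN unique_violation THEN
    -- another transaction created the row first; update it instead
    UPDATE transaction_summary SET ... WHERE receiver = new.receiver;
END;
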
Here's how the SERIALIZABLE transaction isolation level works:

create table test (id serial, x integer, total integer); ...

Transaction 1:

DB=# begin transaction isolation level serializable;
BEGIN
DB=# insert into test (x, total) select 3, 100 where not exists (select true from test where x=3);
INSERT 0 1
DB=# select * from test;
 id | x | total
----+---+-------
  1 | 3 |   100
(1 row)
DB=# commit;
COMMIT

Transaction 2, interleaved line for line with the first:

DB=# begin transaction isolation level serializable;
BEGIN
DB=# insert into test (x, total) select 3, 200 where not exists (select true from test where x=3);
INSERT 0 1
DB=# select * from test;
 id | x | total
----+---+-------
  2 | 3 |   200
(1 row)
DB=# commit;
ERROR:  could not serialize access due to read/write dependencies among transactions
DETAIL:  Reason code: Canceled on identification as a pivot, during commit attempt.
HINT:  The transaction might succeed if retried.

Upvotes: 4
