Postgres deadlock with read_commited isolation

Question

We have noticed a rare occurrence of a deadlock on a Postgresql 9.2 server on the following situation:

T1 starts the batch operation:

UPDATE BB bb SET status = 'PROCESSING', chunk_id = 0 WHERE bb.status ='PENDING' 
AND bb.bulk_id = 1 AND bb.user_id IN (SELECT user_id FROM BB WHERE bulk_id = 1 
AND chunk_id IS NULL AND status ='PENDING' LIMIT 2000)

When T1 commits after a few hundred milliseconds or so (BB has many millions of rows), multiple threads begin new Transactions (one transaction per thread) that read items from BB, do some processing and update them in batches of 50 or so with the queries:

For select:

SELECT *, RANK() as rno OVER(ORDER BY user_id) FROM BB WHERE status = 'PROCESSING' AND bulk_id = 1 and rno = $1

And Update:

UPDATE BB set datetime=$1, status='DONE', message_id=$2 WHERE bulk_id=1 AND user_id=$3

(user_id, bulk_id have a UNIQUE constraint).

Due to an external to the situation problem, another transaction T2 executes the same query with T1 almost immediately after T1 has committed (the initial batch operation where items are marked as 'PROCESSING').

UPDATE BB bb SET status = 'PROCESSING', chunk_id = 0 WHERE bb.status ='PENDING' 
AND bb.bulk_id = 1 AND bb.user_id IN (SELECT user_id FROM BB WHERE bulk_id = 1 
AND chunk_id IS NULL AND status ='PENDING' LIMIT 2000)

However although these items are marked as 'PROCESSING' this query deadlocks with some of the updates (which are done in batches as i said) off the worker threads. To my understanding this should not happen with READ_COMMITTED isolation level (default) that we use. I am sure that T1 has committed because the worker threads execute after it has done so.

edit: One thing i should clear up is that T2 starts after T1 but before it commits. However due to a write_exclusive tuple lock that we acquire with a SELECT for UPDATE on the same row (that is not affected by any of the above queries), it waits for T1 to commit before it runs the batch update query.

Postgres deadlock with read_commited isolation

Answers (1)

Related Questions