Reputation: 1007
I am utterly confused about how the locking mechanism between SQLAlchemy and PostgreSQL works. I am running a Python Flask app with SQLAlchemy and PostgreSQL. Since I have multiple threads processing some data and updating it in PostgreSQL, I am getting the following deadlock:
2015-12-31 17:[ERROR] (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely) (psycopg2.extensions.TransactionRollbackError) deadlock detected
DETAIL: Process 22833 waits for ShareLock on transaction 14114188; blocked by process 19759.
Process 19759 waits for ShareLock on transaction 14114189; blocked by process 22833.
Is this how a deadlock may arise?

Thread 1                            Thread 2
   | (start a SQLAlchemy session)      |
db.session()                        db.session()
   | (using SQLAlchemy)                |
Update row1 of table1               Update row2 of table1
   |                                   |
Update row2 of table1               Update row1 of table1
   |                                   |
session.commit()                    session.commit()
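For concreteness, here is a minimal sketch of what I suspect my threads are doing (Table1 and its counter column are placeholders for my real model; db is the Flask-SQLAlchemy handle, i.e. db = SQLAlchemy(app)):

import threading

def update_two_rows(first_id, second_id):
    session = db.session  # Flask-SQLAlchemy scoped session (one per thread)
    a = session.query(Table1).filter_by(id=first_id).one()
    a.counter += 1
    # Autoflush: this second query first flushes the pending UPDATE of
    # row `first_id` (the "Query-invoked autoflush" in my traceback),
    # which takes PostgreSQL's row lock on that row.
    b = session.query(Table1).filter_by(id=second_id).one()
    b.counter += 1
    session.commit()  # flushes the second UPDATE, which may block

# Thread 1 touches row1 then row2; thread 2 touches row2 then row1.
# Each thread ends up waiting for the row lock the other already holds,
# so PostgreSQL aborts one transaction with "deadlock detected".
t1 = threading.Thread(target=update_two_rows, args=(1, 2))
t2 = threading.Thread(target=update_two_rows, args=(2, 1))
t1.start(); t2.start()
t1.join(); t2.join()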
Here are some answers to my problem, but I am unable to relate them to SQLAlchemy.
Upvotes: 5
Views: 4316
Reputation: 2451
In PostgreSQL the rows will be locked as they are updated -- in fact, the way this actually works is that each tuple (version of a row) has a system field called xmin to indicate which transaction made that tuple current (by insert or update) and a system field called xmax to indicate which transaction expired that tuple (by update or delete). When you access data, PostgreSQL checks each tuple to determine whether it is visible to your transaction by comparing your active "snapshot" against these values.
If you are executing an UPDATE and a tuple which matches your search conditions has an xmin which would make it visible to your snapshot and an xmax of an active transaction, it blocks, waiting for that transaction to complete. If the transaction which first updated the tuple rolls back, your transaction wakes up and processes the row; if the first transaction commits, your transaction wakes up and takes action depending on the current transaction isolation level.
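You can see these system columns for yourself; a quick sketch through the same SQLAlchemy session (the table and id column are the asker's placeholders):

from sqlalchemy import text

# xmin and xmax are system columns PostgreSQL keeps on every table; they
# are not returned by SELECT * but can be selected by name.
rows = db.session.execute(
    text("SELECT id, xmin, xmax FROM table1 WHERE id IN (1, 2)")
).fetchall()
for row_id, xmin, xmax in rows:
    # xmax = 0 means no transaction has expired (updated/deleted) this
    # tuple; a nonzero xmax belonging to a still-active transaction is
    # exactly what a concurrent UPDATE of the same row blocks on.
    print(row_id, xmin, xmax)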
Obviously, a deadlock is the result of this happening to rows in a different order in each transaction. There is no row-level lock in RAM which can be obtained for all rows at the same time, but if rows are updated in the same order you can't have the circular locking. Unfortunately, the suggested IN (1, 2) syntax doesn't guarantee that. Different sessions may have different costing factors active, a background "analyze" task may change statistics for the table between the generation of one plan and the other, or one plan may use a seqscan and be affected by the PostgreSQL optimization which causes a new seqscan to join one already in progress and "loop around" to reduce disk I/O.
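The reliable fix is therefore to make every transaction take its row locks in one deterministic order before modifying anything. One way to do that from SQLAlchemy (0.9+) is SELECT ... FOR UPDATE with an ORDER BY; a sketch, reusing the asker's assumed Table1 model:

# Emits SELECT ... FROM table1 WHERE id IN (1, 2) ORDER BY id FOR UPDATE:
# the rows are locked in ascending id order, so two transactions can
# never hold the same pair of row locks in opposite order.
rows = (
    db.session.query(Table1)
    .filter(Table1.id.in_([1, 2]))
    .order_by(Table1.id)
    .with_for_update()
    .all()
)
for row in rows:
    row.counter += 1   # safe: all needed row locks are already held
db.session.commit()

The same principle works without explicit locking: if every code path updates rows in ascending primary-key order, the circular wait can never form.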
Read this: http://elioxman.blogspot.in/2013/02/postgres-deadlock.html
Upvotes: 4