Vlad Mihalcea
Vlad Mihalcea

Reputation: 154070

Why does MVCC require locking for DML statements

In PostgreSQL, the MVCC concurrency control mechanism says that:

MVCC locks acquired for querying (reading) data do not conflict with locks acquired for writing data, and so reading never blocks writing and writing never blocks reading

So, even for READ_COMMITTED, an UPDATE statement will lock the currently affected rows so no other transaction can modify them, until the current transaction commits or rolls back.

If a concurrent transaction issues an UPDATE on the locked rows, the second transaction will block until the first one releases it's locks.

  1. Is this behavior trying to prevent the write-write conflicts?

  2. Lost updates can still happen in READ_COMMITTED, as after the first transaction commits, the second one will overwrite the row (even if the database has changed between the UPDATE query start and the query end). So if lost updates are still possible, why does the second transaction have to wait? Couldn't the row-level snapshots be used to store the uncommitted transaction changes to avoid transactions having to wait for write-locks to be released?

Upvotes: 7

Views: 1419

Answers (2)

Evgeny Zhahovchik
Evgeny Zhahovchik

Reputation: 11

So if lost updates are still possible, why does the second transaction have to wait? Couldn't the row-level snapshots be used to store the uncommitted transaction changes to avoid transactions having to wait for write-locks to be released?

Yes, lost updates are still possible, but to reproduce lost update situation (from non-repeatable read perspective) we need to have interaction between readers and writers from concurrent transactions. But what if we have two concurrent transactions and they both execute queries like:

UPDATE register SET total = total + 100 WHERE id = 1;

So, we have atomic operations where we have no room for interaction between reader and writer and we always expect that last transaction will try to apply its statement to the up-to-date data (and if the data changed but not committed we should wait) even in MVCC and regardless of isolation level. And as a result we get +200 from both transactions in READ COMMITTED level.

But without locking mechanism, we will end up having +100 as a result of both transactions and it will be similar to the lost update of write-write conflict that you mentioned. So writers always block writers to avoid problems like that.

Upvotes: 1

Glenn Paulley
Glenn Paulley

Reputation: 76

The answer to the first question is Yes. No DBMS can support dirty writes; if two transactions T1 and T2 are concurrently executing and T2 overwrites an update from T1, then the system cannot handle the case where T1 subsequently issues a ROLLBACK since T2's update has already occurred.

To avoid dirty writes, the original definition for snapshot isolation was "first committer wins" - that is, conflicting writes would be allowed to happen, but only the first transaction to issue a COMMIT would be able to - all other conflicting transactions would have to ROLLBACK. But this programming model is somewhat problematic, if not wasteful, since a transaction might update a significant proportion of the database only to be refused the ability to COMMIT at the end. So, instead of "first committer wins" most DBMS systems that support MVCC implement "first updater wins" using fairly traditional two-phase locking.

Upvotes: 5

Related Questions