njz2104
njz2104

Reputation: 63

Serializable Isolation Level Confusion - Write Skew (Postgres)

I'm running Postgres12 and confused about the behavior of the serializable transaction level.

Tables:

Events

  1. id
  2. difficulty

Managers

  1. id
  2. level

Intended behavior (within serialized transaction):

  1. check if there are 7 or more events of difficulty=2
  2. if so, insert a manager with level=2

I'm running the following transactions in serializable but not seeing the behavior I am expected (expected the serializable transaction to detect write skew between 2 sessions)

-- session 1:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE
SELECT count(*) from events WHERE difficulty=2
-- RETURNS 7
-- now start session 2
-- session 2:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE
SELECT id FROM events WHERE difficulty=2 LIMIT 1;
/* 
 id 
----
  4
*/
 UPDATE events SET difficulty=1 WHERE id=4;
COMMIT;

now there are only 6 events of difficulty=2

-- back in session 1
-- since we have counted 7 events of difficulty=2 in this session, create a manager
INSERT INTO manager (level) VALUES (2);
COMMIT;
-- Expected write skew to be detected here bc the read event rows have seen updates (only 6 actually)

Unfortunately, our final state is now 6 events of difficulty=2 and a manager of level 2. Why didn't serializable isolation prevent this write skew? What am I misunderstanding about serializable isolation use case? Why are events with difficulty=2 not locked or watched by predicate locking or some isolation mechanism?

Picture for clarity enter image description here

Upvotes: 4

Views: 2689

Answers (3)

Write skew doesn't occur in SERIALIZABLE in PostgreSQL:

Isolation Level Write Skew
READ UNCOMMITTED Yes
READ COMMITTED Yes
REPEATABLE READ Yes
SERIALIZABLE No

I experimented if write skew occurs in SERIALIZABLE in PostgreSQL. There is event table with name and user as shown below.

event table:

name user
Make Sushi John
Make Sushi Tom

Then, I took these steps below for the experiment of write skew. *Only 3 users can join the event "Make Sushi":

Flow Transaction (T1) Transaction (T2) Explanation
Step 1 BEGIN; T1 starts.
Step 2 BEGIN; T2 starts.
Step 3 SELECT count(*) FROM event WHERE name = 'Make Sushi';

2
T1 reads 2 so only one user can join it.
Step 4 SELECT count(*) FROM event WHERE name = 'Make Sushi';

2
T2 reads 2 so only one user can join it.
Step 5 INSERT INTO event values ('Make Sushi', 'Lisa'); T1 inserts Lisa to event table.
Step 6 COMMIT; T1 commits.
Step 7 INSERT INTO event values ('Make Sushi', 'Kai');

ERROR: could not serialize access due to read/write dependencies among transactions
T2 cannot insert Kai to event table and gets error.
Step 8 COMMIT; T2 rollbacks with COMMIT query.

*Write skew doesn't occur.

Upvotes: 0

Laurenz Albe
Laurenz Albe

Reputation: 247625

SERIALIZABLE means that there is a way to execute the transactions serially (one after the other) so that the effect is the same. In your case, this equivalent serial execution would run session 1 first, then session 2, with the same effect.

You could say that session 1 executes "logically" before session 2.

Upvotes: 2

njz2104
njz2104

Reputation: 63

Answering my own question after some thinking!

The serialization check is not preventing the two sessions from committing because it is possible to serialize the two transactions and still end up with a level 2 manager and 6 events of difficulty=2.

E.g.

  1. Run session 1 (check if 7 events with difficulty=2, create manager level=2) COMMIT;
  2. Run session 2 (remove one event, now 6 events with difficulty=2) COMMIT;

^Output = 6 events, 1 manager

This is the same result as running concurrently so this is deemed an "acceptable" state for these two serializable transactions.

if you want to prevent this behaviour, session 2 can be updated to the following

begin transaction isolation level serializable;
select count(*) from manager where level=2; 
--if no managers
update events set difficulty=1 where id=4;

Now there is no logical way to end up with the state 6 events 1 manager with a serialized ordering. The two possible outcomes from a sequential ordering are

  1. session 1 runs
  2. session 2 runs

^output = 7 events 1 managers

  1. session 2 runs
  2. session 1 runs

^output = 6 events 0 managers

So in this case (with the updated session 2), one of your transactions would be blocked because the transactions are no longer serializable.

Upvotes: 0

Related Questions