sqldatabasetransactionslost-updatewrite-skew

Reputation: 405

'Lost update' vs 'Write skew'

Can somebody explain me whats exactly the difference between a 'lost update' and a 'write skew' in database transaction theory? Can somebody give me an example?

Upvotes: 17

Answers (2)

Super Kai - Kazuya Ito

Reputation: 1

Lost update is that two transactions read the same row to update it but the first committed update is overwritten by the second committed update.
Write skew is that two transactions read the same data to insert, update or delete rows but the committed data is inconsistent.

First, this is the example of lost update below. There is product table with id, name and stock as shown below.

`product` table:

id	name	stock
1	Apple	10
2	Orange	20

These steps below shows lost update. *The stocks of the products decrease when customers buy the products:

Flow	Transaction 1 (T1)	Transaction 2 (T2)	Explanation
Step 1	`BEGIN;`		T1 starts.
Step 2		`BEGIN;`	T2 starts.
Step 3	`SELECT stock FROM product WHERE id = 2;` 20		T1 reads `20` which is updated later to `13` because a customer buys 7 oranges.
Step 4		`SELECT stock FROM product WHERE id = 2;` 20	T2 reads `20` which is updated later to `16` because a customer buys 4 oranges.
Step 5	`UPDATE product SET stock = 13 WHERE id = 2;`		T1 updates `20` to `13`.
Step 6	`COMMIT;`		T1 commits.
Step 7		`UPDATE product SET stock = 16 WHERE id = 2;`	T2 updates `13` to `16` after T1 commits.
Step 8		`COMMIT;`	T2 commits. *Lost update occurs.

Second, this is the example of write skew. There is doctor table with id, name and on_call as shown below.

`doctor` table:

id	name	on_call
1	John	True
2	Lisa	True

These steps below shows write skew. *At least one doctor must be on call but finally, no doctor is on call:

Flow	John's Transaction (JT)	Lisa's Transaction (LT)	Explanation
Step 1	`BEGIN;`		JT starts.
Step 2		`BEGIN;`	LT starts.
Step 3	*`SELECT count() FROM doctor WHERE on_call = True;`** 2		JT reads `2` so John can take a rest.
Step 4		*`SELECT count() FROM doctor WHERE on_call = True;`** 2	LT reads `2` so Lisa can take a rest.
Step 5	`UPDATE doctor SET on_call = False WHERE id = 1;`		JT updates `True` to `False` which means John takes a rest.
Step 6	`COMMIT;`		JT commits.
Step 7		`UPDATE doctor SET on_call = False WHERE id = 2;`	LT updates `True` to `False` which means Lisa takes a rest.
Step 8		`COMMIT;`	LT commits. John and Lisa both take a rest. *Write skew occurs.

Next, this is also the example of write skew. There is event table with id, name and user as shown below.

`event` table:

id	name	user
1	Make Sushi	John
2	Make Sushi	Tom

These steps below shows write skew. *Only 3 users can join the event "Make Sushi" but finally, 4 users join it:

Flow	Transaction (JT)	Transaction (LT)	Explanation
Step 1	`BEGIN;`		T1 starts.
Step 2		`BEGIN;`	T2 starts.
Step 3	*`SELECT count() FROM event WHERE name = 'Make Sushi';`** 2		T1 reads `2` so only one user can join it.
Step 4		*`SELECT count() FROM event WHERE name = 'Make Sushi';`** 2	T2 reads `2` so only one user can join it.
Step 5	`INSERT INTO event values ('Make Sushi', 'Lisa');`		T1 inserts `Lisa` to `event` table.
Step 6	`COMMIT;`		T1 commits.
Step 7		`INSERT INTO event values ('Make Sushi', 'Kai');`	T2 inserts `Kai` to `event` table.
Step 8		`COMMIT;`	T2 commits. 4 users join it. *Write skew occurs.

These tables below are my experiment results of lost update and write skew with each isolation level in MySQL and PostgreSQL. *Yes means Occurs, No means Doesn't occur and SFU means SELECT FOR UPDATE.

MySQL:

Isolation Level	Lost Update	Lost Update with SFU	Write Skew	Write Skew with SFU
READ UNCOMMITTED	Yes	No	Yes	No
READ COMMITTED	Yes	No	Yes	No
REPEATABLE READ	Yes	No	Yes	No
SERIALIZABLE	No	No	No	No

PostgreSQL:

Isolation Level	Lost Update	Lost Update with SFU	Write Skew	Write Skew with SFU
READ UNCOMMITTED	Yes	No	Yes	No
READ COMMITTED	Yes	No	Yes	No
REPEATABLE READ	No	No	Yes	No
SERIALIZABLE	No	No	No	No

Upvotes: 8

Joel Stevick

Reputation: 2038

Informally, lost updates and write skew are ways that concurrent write transactions can interfere with each other.

Write skew happens when an update is made within a transaction based upon stale data. Stale data is a value read by a transaction that has become stale due to a subsequent committed write from a concurrent transaction.

Lost updates happen when a committed value written by one transaction is overwritten by a subsequent committed write from a concurrent transaction. In fact, lost update is really a special case of write skew; where updates are applied to data that has become stale.

Consider the case where a database for a retail store maintains an Inventory table. The database does not implement transaction isolation.

The Inventory table has a "ProductId" column and an "InStock" column that counts the number of items that are currently in stock for a particular product. Each purchase (transaction) decrements the "InStock" value by the number of items purchased.

Imagine that the store has two electric shavers (of a specific model) in stock.

Two customers each purchase one of these shavers, simultaneously.

Each of the concurrent purchases (transactions) reads the same value (two) from the shaver's "InStock" record. The transactions each decrement the "InStock" counter and commit the updated value (one) to the database. After both of the concurrent transactions have committed, the counter will incorrectly indicate that the shaver is still in stock (one item remaining).

One of the updates was lost.

Suppose the database implements snapshot Isolation (with lost update detection), in this case lost updates don't happen. This is because snapshot isolation detects when a lost update has occurred. After a transaction commits data, concurrent transactions that attempt to commit writes for the same data are aborted by the database. In our example, the process for which the transaction is aborted starts a new transaction to re-read the "InStock" column, decrement it, and commit the updated value. Assuming no other conflicts, this attempt to update the record commits successfully and the "Instock" column contains the (correct) value zero.

Transaction isolation is a deep topic.

Furthermore assume that the database records inventory history in an InventoryHistory table. The InventoryHistory table has the columns "Timestamp", "ProductId", and "InStock" (remaining after purchase). By design, the update to the InventoryHistory table is the last operation in a purchase transaction. After the two transactions commit, the respective InventoryHistory records will each reflect an "Instock" value of one -- this is incorrect, since one of the records should reflect an "Instock" value of zero. The incorrect InventoryHistory record is an example of write skew.

In this case snapshot isolation did not prevent anomalous data from being written to the database, since no updates were lost. Rather, the data written was anomalous because a value that was read by the transaction had become stale -- this is write skew. Snapshot isolation does not prevent write skew. To prevent write skew, the database must implement serializable isolation.

Read this article for a rigorous discussion of write skew, serializability and snapshot isolation.

Upvotes: 15

&#39;Lost update&#39; vs &#39;Write skew&#39;

Answers (2)

product table:

*Lost update occurs.

doctor table:

John and Lisa both take a rest.

*Write skew occurs.

event table:

4 users join it.

*Write skew occurs.

Related Questions

'Lost update' vs 'Write skew'

`product` table:

`doctor` table:

`event` table: