Michael Vigato
Michael Vigato

Reputation: 336

How can this SQL command cause phantom read?

In short, my professor said the following transactions are susceptible to phantom read if they're both left to default isolation level (READ COMMITTED)

BEGIN;
UPDATE product SET price = 100 WHERE type='Electronics';
COMMIT;


BEGIN;
UPDATE product SET price = 100 WHERE price < 100;
COMMIT;

I can't really seem to be able to figure how a phantom read could happen.

He also said that, to fix this, you'd have to set the second transaction to REPEATABLE READ

So... why? How could a phantom read happen here, and why does REPEATABLE READ fixes it?



EDIT: could this be the case?
Say we have an initial product P that has type=Electronics AND price=1049

T1 would begin, and add P to the set of rows to consider.

T2 would begin, and ignore P (its price is below 1050).

T1 would increment its price to 1100, and COMMITs. Now T2 should update its rows set and include P.

But since in READ COMMITTED a transaction will get an updated snapshot only if changes are made to rows that were within the SET they are considering, the change goes unnotified.

T2, therefore, simply ignores P, and COMMITs.

This scenario was suggested by the example I found in postgresql docs, on the isolation level page, read committed paragraph.

Do you think this is a possible scenario and, hopefully, what my professor meant?

Upvotes: 0

Views: 738

Answers (2)

Laurenz Albe
Laurenz Albe

Reputation: 246403

A phantom read means that if you run the same SELECT twice in a transaction, the second one could get different results than the first.

In the words of the SQL standard:

SQL-transaction T1 reads the set of rows N that satisfy some <search condition>.
SQL-transaction T2 then executes SQL-statements that generate one or more rows that satisfy the <search condition> used by SQL-transaction T1.
If SQL-transaction T1 then repeats the initial read with the same <search condition>, it obtains a different collection of rows.

This can be caused by concurrent data modifications like the ones you quote at low isolation levels. This is because each query will see all committed data, even if they were committed after the transaction started.

You could also speak of phantom reads in the context of an UPDATE statement, since it also reads from the table. Then the same UPDATE can affect different statements if it is run twice.

However, it makes no sense to speak of phantom reads in the context of the two statements in your question: The second one modifies the column it is searching for, so the second execution will read different rows, no matter if there are concurrent data modifications or not.

Note: The SQL standard does not require that REPEATABLE READ transactions prevent phantom reads — this is only guaranteed with SERIALIZABLE isolation.

In PostgreSQL phantom reads are already impossible at REPEATABLE READ isolation, because it uses snapshots that guarantee a stable view of the database.

Upvotes: 1

VAI Jason
VAI Jason

Reputation: 544

This might help

https://en.wikipedia.org/wiki/Isolation_(database_systems)

Non-repeatable reads A non-repeatable read occurs, when during the course of a transaction, a row is retrieved twice and the values within the row differ between reads.

Non-repeatable reads phenomenon may occur in a lock-based concurrency control method when read locks are not acquired when performing a SELECT, or when the acquired locks on affected rows are released as soon as the SELECT operation is performed. Under the multiversion concurrency control method, non-repeatable reads may occur when the requirement that a transaction affected by a commit conflict must roll back is relaxed.

A phantom read occurs when, in the course of a transaction, new rows are added or removed by another transaction to the records being read.

This can occur when range locks are not acquired on performing a SELECT ... WHERE operation. The phantom reads anomaly is a special case of Non-repeatable reads when Transaction 1 repeats a ranged SELECT ... WHERE query and, between both operations, Transaction 2 creates (i.e. INSERT) new rows (in the target table) which fulfil that WHERE clause.

Upvotes: 0

Related Questions