Croco
Croco

Reputation: 356

Is it possible to do a Phantom read to a row someone just updated?

From the MySQL glossary:

phantom: A row that appears in the result set of a query, but not in the result set of an earlier query. For example, if a query is run twice within a transaction, and in the meantime, another transaction commits after inserting a new row or updating a row so that it matches the WHERE clause of the query.

Is the bolded part correct? If I have

CREATE TABLE  t1 (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `c1` varchar(45) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

and the isolation level is REPEATABLE READ and I do

mysql> start transaction;
mysql> SELECT * FROM t1 WHERE c1 < 10;
+----+------+
| id | c1   |
+----+------+
|  1 | 4    |
+----+------+
mysql> SELECT * FROM t1 WHERE c1 < 10;
+----+------+
| id | c1   |
+----+------+
|  1 | 4    |
+----+------+

I could sometimes get different result from the later query even if no one does any INSERTs but only UPDATEs? My MySQL version is 5.7.

SQL standard indicates phantom reads are related only to concurrent INSERTs although the word generate is a bit confusing. From the ISO/IEC 9075:1992, Database Language SQL- July 30, 1992 (Second Informal Review Draft):

P3 ("Phantom"): SQL-transaction T1 reads the set of rows N that satisfy some search condition. SQL-transaction T2 then executes SQL-statements that generate one or more rows that satisfy the search condition used by SQL-transaction T1. If SQL-transaction T1 then repeats the initial read with the same search condition, it obtains a different collection of rows.

Upvotes: 2

Views: 1166

Answers (2)

Bill Karwin
Bill Karwin

Reputation: 562300

InnoDB REPEATABLE-READ transaction isolation level prevents phantom rows, but only if your SELECT query is a non-locking query.

Thus you can SELECT with the same query conditions multiple times during a transaction, and you're guaranteed to get the same result time after time, even while other sessions are inserting, updating, or deleting rows in ways that would affect your result set. As soon as you begin a new transaction, your query will see the changes to the rows that have been going on in the meantime.

But InnoDB has an odd case: if you run a locking read query like one of the following:

SELECT * FROM t1 WHERE c1 < 10 FOR UPDATE

SELECT * FROM t1 WHERE c1 < 10 LOCK IN SHARE MODE

SELECT * FROM t1 WHERE c1 < 10 FOR SHARE -- MySQL 8.0 syntax

Then a SELECT will "see" the results of concurrent changes to data, as if your transaction had been started as a READ-COMMITTED transaction.

You can even switch back and forth between locking read queries and non-locking read queries within the same REPEATABLE-READ transaction, and you'll see different result sets for each. So be aware of this if you use locking SELECT statements.

I think the word "generate" in the excerpt you showed is meant to apply to either INSERT or UPDATE. They needed a term to apply for both cases, because I guess they didn't feel like writing a more clear phrase like "insert or update."

Upvotes: 2

fideloper
fideloper

Reputation: 12293

To my knowledge it's not (shouldn't?) be possible to get a phantom row when within a transaction using REPEATABLE READ.

When creating a transaction within a repeatable-read lock, mysql creates a "snapshot" of the data once the first query in the transaction is run.

So, any select statements will get data within that snapshot.

The two caveats from the docs that I know of that can result in unexpected results:

  1. Any read queries made within a transaction will affect future SELECT statements (that one likely isn't unexpected, but could cause confusion I suppose).
  2. Other write-queries made to the database from other transactions can affect write queries made within the current transaction.

Point 2 there is the more confusing one - See the note in the docs here for more of an explanation there. This also goes into more details on the repeatable read isolation level.

It looks like InnoDB guards against phantom rows.

To prevent phantoms, InnoDB uses an algorithm called next-key locking that combines index-row locking with gap locking. InnoDB performs row-level locking in such a way that when it searches or scans a table index, it sets shared or exclusive locks on the index records it encounters. Thus, the row-level locks are actually index-record locks.

Gap locking can be disabled as discussed in Section 15.7.1, “InnoDB Locking”. This may cause phantom problems because other sessions can insert new rows into the gaps when gap locking is disabled.

I'd love to know if you have a different setting that's creating that situation in your case! Do any of the docs linked clarify that? If this is a rather "deep" mysql question, perhaps the DBA stack exchange may be a better one to try.

Upvotes: 1

Related Questions