quentinadam
quentinadam

Reputation: 3150

INSERT ... SELECT, InnoDB and locking

I came accross the following behaviour with InnoDB engine under MySQL 5.5.34 (on Ubuntu 12.04).

When performing INSERT ... SELECT statements, some unexpected rows seem to be locked in the table being read from.

Let me give an example. Suppose two tables table_source and table_dest with the following structure (particular attention to the indices):

CREATE TABLE table_source (
  id int(11) unsigned NOT NULL AUTO_INCREMENT,
  group_id int(11) NOT NULL,
  data text NOT NULL,
  created timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (id),
  KEY group_id_created (group_id,created)
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8;

CREATE TABLE table_dest (
  id int(11) unsigned NOT NULL AUTO_INCREMENT,
  group_id int(11) NOT NULL,
  data text NOT NULL,
  created timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (id),
  KEY group_id_created (group_id,created)
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8;

Suppose I now execute the following transaction:

BEGIN;
INSERT INTO table_dest 
    SELECT * FROM table_source WHERE group_id = 3 AND created < '2014-01-04';
....

Then the source table seems to be locked for INSERT with group_id 2:

INSERT INTO table_source (group_id, data, created) 
    VALUES (2, 'data', NOW()); --< This locks

Here are some other statements and if they lock or not:

INSERT INTO table_source (group_id, data, created) 
    VALUES (3, 'data', NOW()); --< Does not lock

INSERT INTO table_source (group_id, data, created) 
    VALUES (1, 'data', NOW()); --< Does not lock

INSERT INTO table_source (group_id, data, created) 
    VALUES (3, 'data', '2014-01-01'); --< Does lock

Can somebody explain me why this happens (I suppose it has something to do with gap locks)? Is there a way to avoid this (I still want to keep the REPEATABLE READ isolation level)?

Upvotes: 2

Views: 3824

Answers (1)

jeremycole
jeremycole

Reputation: 2761

That's correct. The rows in the table being read from are locked with a shared lock (the SELECT is implicitly LOCK IN SHARE MODE). There isn't a way to avoid this. It's sort of what you're asking the system for: copy all rows which match a condition. The only way to ensure that is in fact all rows that match the condition and that that list does not change during or immediately after the execution of that statement, is to lock the rows.

As a clarification regarding why you are unable to INSERT with group_id = 2:

This has to do with your query being specifically WHERE group_id = 3 AND created < '2014-01-04' on KEY group_id_created (group_id, created). In order to search all rows which match group_id = 3 AND created < '2014-01-04' the index will be traversed backwards starting with the first row which exceeds that condition the upper bound, which is (3, '2014-01-14') and continuing until finding a row which does not match the condition, which since created has no lower bound will be the first row where group_id < 3 which of course is group_id = 2.

That means that the first row encountered with group_id = 2 is also locked, which will be the row with the maximum created value. This will make it impossible to INSERT into the "gap" between (2, MAX(created)) and (3, MIN(created)) (not proper SQL of course, just pseudo-SQL), although this is not a "gap lock" specifically.

Upvotes: 4

Related Questions