Adam C.

Reputation: 462

race condition in mysql select sql

What I am trying to accomplish seems simple:

Db type: MyISAM
Table Structure: card_id, status
Query: select an unused card_id from a table, and set the row as "used".

Is there a race condition when two queries run at the same time, so that the same card_id is fetched twice before the status is updated?

I have already searched. LOCK TABLES seems to be a solution, but it is overkill for me and requires the LOCK TABLES privilege.

Any ideas?

Thanks!

Upvotes: 2

Views: 2000

Answers (2)

spencer7593

Reputation: 108400

It really depends on what statements you are running.

For plain old UPDATE statements against a MyISAM table, MySQL will obtain a lock on the entire table, so there is no "race" condition between two sessions there. One session will wait until the lock is released, and then proceed with its own update (or will wait for a specified period, and abort with a "timeout").

BUT, if what you are asking about is two sessions that each run a SELECT against a table to retrieve the identifier of a row to be updated, both retrieve the same identifier, and then both attempt to update the same row, then yes, that's a definite possibility, and one which really does have to be considered.

If that condition is not addressed, then it's basically going to be a matter of "last update wins": the second session will (potentially) overwrite the changes made by the first.
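The interleaving described above can be reproduced in a small sketch. This one uses Python's built-in sqlite3 module as a stand-in for MySQL, purely so it runs without a server; the table name cards, the owner column, and the helper functions are assumptions added for illustration and are not part of the original question.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cards (card_id INTEGER PRIMARY KEY, status TEXT, owner TEXT)"
)
conn.executemany(
    "INSERT INTO cards (card_id, status) VALUES (?, 'unused')", [(1,), (2,)]
)


def pick_unused(conn):
    """Step 1 of the naive approach: SELECT an unused card_id."""
    row = conn.execute(
        "SELECT card_id FROM cards WHERE status = 'unused' "
        "ORDER BY card_id LIMIT 1"
    ).fetchone()
    return row[0] if row else None


def mark_used(conn, card_id, owner):
    """Step 2 of the naive approach: UPDATE the row fetched earlier."""
    conn.execute(
        "UPDATE cards SET status = 'used', owner = ? WHERE card_id = ?",
        (owner, card_id),
    )


# Both "sessions" run their SELECT before either runs its UPDATE.
a = pick_unused(conn)
b = pick_unused(conn)
mark_used(conn, a, "session A")
mark_used(conn, b, "session B")

final_owner = conn.execute(
    "SELECT owner FROM cards WHERE card_id = ?", (a,)
).fetchone()[0]
print(a, b, final_owner)
```

Both sessions end up holding the same card_id, and the row records only the later writer, which is the "last update wins" outcome.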

If that's an untenable situation for your application, then that does need to be addressed, either with a different design, or with some mechanism that prevents the second update from overwriting the update applied by the first update.

One approach, as you mentioned, is to avoid this situation by first obtaining an exclusive lock on the table (using a LOCK TABLES statement), then running a SELECT to obtain an identifier, and then running an UPDATE to update the identified row, and then finally, releasing the lock (using an UNLOCK TABLES statement.)

That's a workable approach for some low volume, low concurrency applications. But it does have some significant drawbacks. Of primary concern is reduced concurrency, due to the exclusive locks obtained on a single resource, which has the potential to cause a performance bottleneck.

Another alternative is a strategy called "optimistic locking". (As opposed to the previously described approach, which could be described as "pessimistic locking".)

For an "optimistic locking" strategy, an additional "counter" column is added to the table. Whenever an update is applied to a row in the table, the counter for that row is incremented by one.

To make use of this "counter" column, when a query retrieves a row that will (or might) be updated later, that query also retrieves the value of the counter column.

When an UPDATE is attempted, the statement also compares the current value of the "counter" column in the row with the previously retrieved value of the counter column. We just include a predicate for that (e.g. in the WHERE clause of the UPDATE statement). For example:

UPDATE mytable
   SET counter = counter + 1
     , col = :some_new_value       
 WHERE id = :previously_fetched_row_identifier
   AND counter = :previously_fetched_row_counter

If some other session has applied an update to the row we are attempting to update (sometime between the time our session retrieved the row and before our session is attempting to do the update), then the value of the "counter" column on that row will have been changed.

The predicate on our UPDATE statement checks for that, and if the "counter" has been changed, that will cause our update to NOT be applied. We can then detect this condition (i.e. the affected rows count will be a 0 rather than a 1) and our session can take some appropriate action. ("Hey! Some other session updated a row we were intending to update!")
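As a rough sketch of that check on the client side, the following uses Python's sqlite3 module in place of MySQL (an assumption made so the example is self-contained). The table and column names follow the UPDATE statement above; the helper names fetch and update_if_unchanged are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    " id INTEGER PRIMARY KEY,"
    " col TEXT,"
    " counter INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO mytable (id, col) VALUES (1, 'original')")


def fetch(conn, row_id):
    """Read the row together with its current counter value."""
    return conn.execute(
        "SELECT col, counter FROM mytable WHERE id = ?", (row_id,)
    ).fetchone()


def update_if_unchanged(conn, row_id, seen_counter, new_value):
    """Apply the update only if the counter still matches what we read.

    Returns True when exactly one row was changed, False when another
    session got there first (affected rows count is 0).
    """
    cur = conn.execute(
        "UPDATE mytable SET counter = counter + 1, col = ? "
        "WHERE id = ? AND counter = ?",
        (new_value, row_id, seen_counter),
    )
    return cur.rowcount == 1


# Two sessions read the same row, then both try to update it.
_, counter_a = fetch(conn, 1)
_, counter_b = fetch(conn, 1)
a_ok = update_if_unchanged(conn, 1, counter_a, "from A")
b_ok = update_if_unchanged(conn, 1, counter_b, "from B")
print(a_ok, b_ok, fetch(conn, 1))
```

The second update is rejected because the counter it read is stale; what the losing session does next (re-read and retry, or report a conflict) is up to the application.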

There are some good write-ups on how to implement an "optimistic locking" strategy.

Some ORM frameworks (e.g. Hibernate, JPA) provide support for this type of locking strategy.


Unfortunately, MySQL does NOT provide support for a RETURNING clause in an UPDATE statement, such as:

UPDATE ... 
   SET status = 'used'
 WHERE status = 'unused'
   AND ROWNUM = 1
RETURNING card_id INTO ...

Other RDBMS (e.g. Oracle) do provide that kind of functionality. With that feature of the UPDATE statement available, we can simply run a single UPDATE statement to 1) locate a row with status = 'unused', 2) change the value of status to 'used', and 3) return the card_id (or whatever columns we want) of the row that we just updated.

That gets around the problem of having to run a SELECT and then running a separate UPDATE, with the potential of some other session updating the row between our SELECT and our UPDATE.

But the RETURNING clause is not supported in MySQL. And I've not found any reliable way of emulating this type of functionality from within MySQL.


This may work for you

I'm not entirely sure why I previously abandoned this approach using user variables. (I mentioned above that I had played around with this.) I think maybe I needed something more general, which would update more than one row and return a set of id values. Or maybe there was something that wasn't guaranteed about the behavior of user variables. (Then again, I only reference user variables in carefully constructed SELECT statements; I don't use user variables in DML, and it may be because I don't have a guarantee of their behavior.)

Since you are interested in exactly ONE row, this sequence of three statements may work for you:

SELECT @id := NULL ;

UPDATE mytable
   SET card_id = (@id := card_id) 
     , status = 'used'
 WHERE status = 'unused'
 LIMIT 1 ;

SELECT ROW_COUNT(), @id AS updated_card_id ;

It's IMPORTANT that these three statements run in the SAME database session (i.e. keep a hold of the database session; don't let go of it and get a new one.)

First, we initialize a user variable (@id) to a value which we won't confuse with a real card_id value from the table. (A SET @id := NULL statement would work as well, and unlike the SELECT statement, it does not return a result set.)

Next, we run the UPDATE statement to 1) find one row where status = 'unused', 2) change the value of the status column to 'used', and 3) set the value of the @id user variable to the card_id value of the row we changed. (We'd want that card_id column to be integer type, not character, to avoid any possible character set translation issues.)

Finally, we run a query to get the number of rows changed by the previous UPDATE statement, using the ROW_COUNT() function (we are going to need to verify that this is 1 on the client side), and to retrieve the value of the @id user variable, which will be the card_id value from the row that was changed.
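A client-side wrapper for that three-statement sequence might look like the sketch below. It assumes conn is a connection from a Python DB-API 2.0 MySQL driver and that all three statements go through that one connection; the function name claim_card is hypothetical, and the sketch has not been run against a live MySQL server.

```python
def claim_card(conn):
    """Claim one unused card and return its card_id, or None if none is left.

    `conn` is assumed to be a DB-API 2.0 connection to MySQL. All three
    statements must run on this same connection, because the user
    variable @id and ROW_COUNT() are both scoped to the session.
    """
    cur = conn.cursor()
    try:
        # 1) Reset the user variable so a stale value is never returned.
        cur.execute("SET @id := NULL")
        # 2) Mark one unused row as used and capture its card_id in @id.
        cur.execute(
            "UPDATE mytable"
            "   SET card_id = (@id := card_id)"
            "     , status = 'used'"
            " WHERE status = 'unused'"
            " LIMIT 1"
        )
        # 3) Read back the number of rows changed and the captured id.
        cur.execute("SELECT ROW_COUNT(), @id AS updated_card_id")
        row_count, card_id = cur.fetchone()
    finally:
        cur.close()
    # Only trust @id when exactly one row was actually changed.
    if row_count == 1 and card_id is not None:
        return card_id
    return None
```

Returning None when the row count is not 1 lets the caller distinguish "no unused cards left" from a successful claim.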

Upvotes: 3

Adam C.

Reputation: 462

After I posted this question, I thought of a solution which is exactly the same as the one you mentioned at the end. I used an UPDATE statement, "update TABLE set status ='used' where status = 'unused' limit 1", which returns the primary ID of the TABLE, and then I can use this primary ID to get the card_id. Say two UPDATE statements occur at the same time; as you said, "MySQL will obtain a lock on the entire table, so there is no "race" condition between two sessions there", so this should solve my issue. But I am not sure why you said, "MySQL does NOT provide support an style statement".

Upvotes: 1
