zoltar
zoltar

Reputation: 736

MySQL query on cron overlaps, ignores 'locked' rows

I'm trying to lock a row in a table as being "in use" so that I don't process the data twice when my cron runs every minute. Because of the length of time it takes for my script to run, the cron will cause multiple instances of the script to run at once (usually around 5 or 6 at a time). For some reason, my "in use" method is not always working.

I do not want to LOCK the tables because I need them available for simultaneous processing, that is why I went the route of pseudo-locking individual rows with an 'inuse' field. I don't know of a better way to do this.

Here is an illustration of my dilemma:

 <?
    //get the first row from table_1 that is not in use
    $result = mysqli_query($connect,"SELECT * FROM `table_1` WHERE inuse='no'");
    $rows = mysqli_fetch_array($result, MYSQLI_ASSOC);
    $data1 = $rows[field1];

    //"lock" our row by setting inuse='yes'
    mysqli_query($connect,"UPDATE `table_1` SET inuse='yes' WHERE field1 = '$data1'");

    //insert new row into table_2 with our data if it doesn't already exist
    $result2 = mysqli_query($connect,"SELECT * FROM `table_2` WHERE field='$data2'");
    $numrows = mysqli_num_rows($result2);
    if($numrows >= 1) { 
      //do nothing
    } else {
      //run some unrelated script to get data
      $data2 = unrelatedFunction();

      //insert our data into table_2
      mysqli_query($connect,"INSERT INTO `table_2` (field) value ('$data2')");
    }

    //"unlock" our row in table_1
    mysqli_query($connect,"UPDATE `table_1` SET inuse='no' WHERE field1 = '$data1'");
 ?>

You'll see here that $data2 won't be collected and inserted if a row already exists with $data2, but that part is for error-checking and does not answer my question as the error still occurs. I'm trying to understand why (if I don't have that error-check in there) my 'inuse' method is sometimes being ignored and I'm getting duplicate rows in table_2 with $data2 in them.

Upvotes: 0

Views: 416

Answers (1)

viraptor
viraptor

Reputation: 34195

There's a lot of time in between your first select and the first update where another process can do the same operation. You're not using transaction either, so you're not guaranteeing any order of the changes becoming visible to others.

You can either move everything into a transaction with the isolation level you need and use SELECT .... FOR UPDATE syntax. Or you can try doing the copy in a different way. For example update N rows that you want to process and SET in_use=your_current_pid WHERE in_use IS NULL. Then you can read back the rows you manually marked for processing. After you finish, reset in_use to NULL.

Upvotes: 1

Related Questions