I'll-Be-Back

Reputation: 10838

Improve `Update` performance (rows locking issue)

I am running 30 scripts (PHP CLI) on Linux; each script updates rows in the MySQL database in a loop.

When I run 'mysqladmin proc' in the terminal, I can see that many rows have been locked for 10-30 seconds, mostly queued UPDATE queries. How can I improve the performance? I am using the InnoDB engine.

The PHP script looks something like this:

//status and process are indexed.
$SQL = "SELECT * FROM data WHERE status = 0 AND process = '1'";
$query = $db->prepare($SQL);
$query->execute();

//about 100,000+ rows for each script
while ($row = $query->fetch(PDO::FETCH_ASSOC)) {
    checking($row);
    sleep(2);
}

function checking($data) {
    global $db; // $db is not visible inside the function without this

    $error = errorCheck($data['number']);

    if ($error) {
        //number indexed
        $SQLUpdate = "UPDATE data SET status = 2, error = ? WHERE number = ?";
        $update = $db->prepare($SQLUpdate);
        $update->execute(array($error, $data['number']));
        return false;
    }

    //good?
    $SQLUpdate = "UPDATE data SET status = 1 WHERE number = ?";
    $update = $db->prepare($SQLUpdate);
    $update->execute(array($data['number']));

    $SQLInsert = "INSERT INTO tbl_done .....";
    $insert = $db->prepare($SQLInsert);
    $insert->execute();
}

top command:

top - 10:48:54 up 17 days, 10:30,  2 users,  load average: 1.06, 1.05, 1.01
Tasks: 188 total,   1 running, 187 sleeping,   0 stopped,   0 zombie
Cpu(s): 25.8%us,  0.1%sy,  0.0%ni, 74.1%id,  0.0%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   4138464k total,  1908724k used,  2229740k free,   316224k buffers
Swap:  2096440k total,       16k used,  2096424k free,   592384k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
32183 mysql     15   0  903m 459m 4800 S 101.8 11.4 876:53.66 mysqld


/etc/my.cnf
[mysqld]
set-variable = max_connections=500
safe-show-database
max_user_connections=200
key_buffer_size = 16M
query_cache_size = 350M
tmp_table_size = 200M
max_heap_table_size  = 200M
thread_cache_size = 4
table_cache = 800
thread_concurrency = 8
innodb_buffer_pool_size = 400M
innodb_log_file_size = 128M
query_cache_limit = 500M
innodb_flush_log_at_trx_commit = 2

Server Spec: Intel Core 2 Quad Q8300, 2.5 GHz, 4GB ram.

'mysqladmin proc':

+------+-----------------+-----------+----------------+---------+------+----------+-------------------------------------------------------------------------------
| Id   | User            | Host      | db             | Command | Time | State    | Info                                                                          
+------+-----------------+-----------+----------------+---------+------+----------+--------------------------------------------------------------------------------
|  265 | user            | localhost | xxxxxxxxxxxxxx | Query   |   15 | Updating | UPDATE data SET status = '2', error = 'Unknown error'  WHERE number= 0xxxxx    
|  269 | user            | localhost | xxxxxxxxxxxxxx | Query   |   17 | Updating | UPDATE data SET status = '2', error = 'Invalid ....'  WHERE number= 0xxx 
|  280 | user            | localhost | xxxxxxxxxxxxxx | Query   |    7 | Updating | UPDATE data SET status = 1  WHERE f = 0xxxx                                           
|  300 | user            | localhost | xxxxxxxxxxxxxx | Query   |    1 | Updating | UPDATE data SET status = '2', error = 'Unknown ....'  WHERE number= 0xx             
|  314 | user            | localhost | xxxxxxxxxxxxxx | Query   |   13 | Updating | UPDATE data SET status = '2', error = 'Invalid....'  WHERE number= 0xxxx
|  327 | user            | localhost | xxxxxxxxxxxxxx | Query   |   11 | Updating | UPDATE data SET status = '2', error = 'Unknown ....'  WHERE number= 0xxxx               
|  341 | user            | localhost | xxxxxxxxxxxxxx | Sleep   |    2 |          | NULL                                                                                      
|  350 | user            | localhost | xxxxxxxxxxxxxx | Query   |    7 | Updating | UPDATE data SET status = '2', error = 'Unknown ....'  WHERE number= 0xxx                
|  360 | user            | localhost | xxxxxxxxxxxxxx | Query   |    5 | Updating | UPDATE data SET status = 1  WHERE number = 0xxxx     

Explain:

+----+-------------+-------+-------------+----------------+----------------+---------+------+-------+----------------------------------------------+
| id | select_type | table | type        | possible_keys  | key            | key_len | ref  | rows  | Extra                                        |
+----+-------------+-------+-------------+----------------+----------------+---------+------+-------+----------------------------------------------+
|  1 | SIMPLE      | data  | index_merge | process,status | process,status | 52,1    | NULL | 16439 | Using intersect(process,status); Using where |
+----+-------------+-------+-------------+----------------+----------------+---------+------+-------+----------------------------------------------+

Upvotes: 1

Views: 359

Answers (2)

melihcelik

Reputation: 4599

When you execute the SELECT query, you acquire read locks on the rows being read. Within the checking function, you then try to update the row that is currently being read (and therefore locked), so MySQL queues the UPDATE query to be executed as soon as the read lock is released by the SELECT. But since you suspend execution for two seconds after every row, you increase the delay before the lock is released, which in turn delays every query waiting in the update queue. You can read more about InnoDB lock modes.
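For reference, this is what InnoDB's explicit locking reads look like (the `number` value below is only a placeholder):

```sql
-- Shared (read) lock: other sessions can still read the row,
-- but cannot modify it until this transaction ends
SELECT * FROM data WHERE number = 12345 LOCK IN SHARE MODE;

-- Exclusive lock: blocks other locking reads and UPDATEs on the row
SELECT * FROM data WHERE number = 12345 FOR UPDATE;
```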

I would suggest modifying the code as such:

  • Limit your SELECT query so it returns only a limited number of rows, and make sure the next iteration selects the remaining rows. You can achieve this with the LIMIT and OFFSET clauses in your SELECT query.
  • Read all the rows from the SELECT query into an array and close the statement, so that the read locks are released as well.
  • Iterate over your array of rows and update each one.
  • Continue with the first step from where you left off.
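Putting those steps together, a minimal sketch might look like this (assuming the `$db` PDO handle and the `checking()` function from the question; the batch size is an arbitrary example value):

```php
<?php
$batchSize = 10000; // arbitrary example; tune to your memory budget
$offset = 0;

do {
    // 1. Fetch a limited batch instead of the whole 100,000+ row result set
    $query = $db->query(sprintf(
        "SELECT * FROM data WHERE status = 0 AND process = '1' LIMIT %d OFFSET %d",
        $batchSize, $offset
    ));

    // 2. Buffer the batch in an array and close the statement,
    //    so the read side is finished before any updates run
    $rows = $query->fetchAll(PDO::FETCH_ASSOC);
    $query->closeCursor();

    // 3. Update every buffered row
    foreach ($rows as $row) {
        checking($row);
    }

    // 4. Continue with the next batch from where we left off
    $offset += $batchSize;
} while (count($rows) === $batchSize);
```

Note that `checking()` moves rows out of `status = 0`, so the matching set shrinks as you go; depending on how many rows each pass changes, re-running with `OFFSET 0` every time may be the simpler variant.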

UPDATE

You are using fetch to retrieve the rows from result set. According to the documentation:

Fetches a row from a result set associated with a PDOStatement object

In order to retrieve all the rows at once, you should use fetchAll, but be careful of performance problems, as the documentation states:

Using this method to fetch large result sets will result in a heavy demand on system and possibly network resources.

That's why I suggested limiting the query to retrieve a certain number of rows at a time, instead of the entire result set (100,000+ rows). You can limit the number of returned rows by modifying your query like:

SELECT * FROM data WHERE status = 0 AND process = '1' LIMIT 10000 OFFSET 0

Then, when you run the query a second time, run it as:

SELECT * FROM data WHERE status = 0 AND process = '1' LIMIT 10000 OFFSET 10000

You can continue like this until no more rows are returned.

Upvotes: 2

Eugen Rieck

Reputation: 65342

  1. You update the rows with errors to status = 2, then all rows to status = 1; I assume this is a typo (a missing else)

  2. If you really sleep 2 seconds between rows, it would be wiser to select with LIMIT 1 and rerun the SELECT query every 2 seconds. This could also account for your locks if number is not unique
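A sketch of the second suggestion, reusing `checking()` and the `$db` PDO handle from the question:

```php
<?php
// Fetch one pending row at a time, so no long-lived result set is
// held open across the two-second sleep.
while (true) {
    $query = $db->query(
        "SELECT * FROM data WHERE status = 0 AND process = '1' LIMIT 1"
    );
    $row = $query->fetch(PDO::FETCH_ASSOC);
    $query->closeCursor();

    if ($row === false) {
        break; // nothing left to process
    }

    checking($row);
    sleep(2);
}
```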

Upvotes: 0
