user26676

Reputation: 278

PHP General Principles: is one big SQL call better or lots of little ones

This is an optimisation question about first principles. Imagine I am doing a big heavy-lifting comparison: 30k files vs 30k database entries. Is it more process-efficient to do one big MySQL query into an array and then loop through the physical files checking against the array, or is it better to loop through the files and make a one-row MySQL call for each?

Here is some pseudo code to help explain:

 //is this faster?
 foreach($recursiveFileList as $fullpath){
     $Record = $db->queryrow("SELECT * FROM files WHERE fullpath='".$fullpath."'");
     //do some $Record logic           
 }


 //or is this faster
 $BigList = array();
 $db->query("SELECT * FROM files");
 while($Record = $db->rows()){
     $BigList[$Record['fullpath']] = $Record;
 }

 foreach($recursiveFileList as $fullpath){
     if (isset($BigList[$fullpath])){
         $Record = $BigList[$fullpath];
         //do some $Record logic
     }   
 }

Upvotes: 3

Views: 336

Answers (1)

Bill Karwin

Reputation: 562921

Update: if you always know that your $recursiveFileList is 100% of the table, then doing one query per row would be needless overhead. In that case, just use SELECT * FROM files.

I wouldn't use either of the two styles you show.

The first style runs one separate SQL query for each individual fullpath. This incurs the overhead of SQL parsing, optimization, etc. on every call. Keep in mind that MySQL does not have the capability of remembering the query optimization from one invocation of a similar query to the next; it analyzes and optimizes the query every time. The overhead is relatively small, but it adds up.

The second style fetches all rows from the table and filters them in the application layer. This has a lot of overhead, because typically your $recursiveFileList might match only 1% or 0.1% or an even smaller portion of the rows in the table. I have seen cases where transferring excessive amounts of data over the network literally exhausted a 1Gbps network switch, and this put a ceiling on the requests per second for the application.

Use query conditions and indexes wisely to let the RDBMS examine and return only the matching rows.
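For instance, assuming the `files` table from the question has no index on `fullpath` yet, a hypothetical DDL statement like the following lets MySQL seek directly to matching rows instead of scanning the whole table:

```sql
-- Hypothetical index for the `files` table in the question;
-- adjust the key length if fullpath is a long VARCHAR/TEXT column.
ALTER TABLE files ADD INDEX idx_fullpath (fullpath);
```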

The two styles you show are not the only options. What I would suggest is to use an IN() predicate (which MySQL's optimizer treats as a range access) to match multiple fullpath values in a single query.

$sql = "SELECT * FROM files WHERE fullpath IN ("
    . implode(",", array_fill(0, count($recursiveFileList), "?")) . ")";
$stmt = $pdo->prepare($sql);
$stmt->execute($recursiveFileList);
while ($row = $stmt->fetch()) {
    //do some $row logic
}

Note I also use a prepared query with ? parameter placeholders, and then pass the array of fullpath values separately when I call execute(). PDO is nice for this, because you can just pass an array, and the array elements get matched up to the parameter placeholders.

This also eliminates the risk of SQL injection in this case.
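If the file list can get very large, one IN() list may produce an oversized SQL statement. A minimal sketch of batching the query, assuming the same `files` table and a PDO connection in `$pdo` (the helper name and batch size are my own, not from the answer):

```php
<?php
// Hypothetical helper: builds the IN() query for one batch so the
// placeholder count always matches the number of values bound.
function buildInQuery(array $paths): string
{
    $placeholders = implode(",", array_fill(0, count($paths), "?"));
    return "SELECT * FROM files WHERE fullpath IN ($placeholders)";
}

// Usage sketch, assuming $pdo and $recursiveFileList exist:
// foreach (array_chunk($recursiveFileList, 1000) as $chunk) {
//     $stmt = $pdo->prepare(buildInQuery($chunk));
//     $stmt->execute($chunk);
//     while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
//         //do some $row logic
//     }
// }
```

The batch size of 1000 is an assumption; tune it against your server's max_allowed_packet and typical path lengths.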

Upvotes: 1
