Gimli

Reputation: 488

PHP w/ MS SQL slow on bulk insert

I am having an issue with bulk inserts into a staging table in MS SQL Server 2008 R2.

I am inserting a CSV with ~200,000 lines and it is taking roughly 5 minutes to complete.

I tried using both PDO and the sqlsrv driver. They both seem to yield poor performance.

Here is the code to give an idea of what I am doing (I've included both the SQLSRV and PDO code):

...
    try {
        //create structure table record
        foreach ($mapped_data as $k => $v) {
            $insert .= $k . ",";
            $values .= $v . ",";
        }
        $insert = substr($insert, 0, -1); //remove last ,
        $values = substr($values, 0, -1);
        $tableName = $table;
        if ($stageData) {
            $tableName = "stage_$table";
        }
        if ( $query == "" )
            $query = "INSERT INTO $tableName ($insert) VALUES ";
        $query .= "($values),";

        // Insert in blocks of 1000 lines
        if ($line % 1000 == 0) {
            $log->logInfo("Executing @ line: $line");
            $query = substr($query, 0, -1); //remove last ,
            $query .= ";";

            // ======================
            // = SQLSRV DRIVER CODE =
            // ======================
            sqlsrv_begin_transaction($sqlsrvConn);
            $queryResult = sqlsrv_query($sqlsrvConn,$query);
            if ($queryResult) {
                sqlsrv_commit($sqlsrvConn);
            } else {
                sqlsrv_rollback($sqlsrvConn);
            }

            // ===================
            // = PDO DRIVER CODE =
            // ===================
            $conn->beginTransaction();
            $res = $conn->prepare($query);
            if($res->execute() === false) {
                $errInfo = $res->errorInfo();
                if ( $conn->inTransaction() ) {
                    $conn->rollback();
                }
                $log->logInfo('Data importing error on line: ' . $line . $errInfo[2]);
                $errors[] = 'Data importing error on line: ' . $line . $errInfo[2];
            } else {
                if ( $conn->inTransaction() ) {
                    $conn->commit();
                    $query = "";
                    $importedRows += ($line - 6) - $importedRows;
                }
            }
        }
    }
    catch (PDOException $e) {
        if ( $conn->inTransaction() ) {
            $conn->rollBack();
        }
        $log->logInfo('PDO Exception: ' . $e->getMessage());
        $errors[] = 'PDO Exception: ' . $e->getMessage();
    }
    $line++;
} // End of while loop through each CSV Line

fclose($handle);
$totalRows = $line - 6;
$importedRows += $totalRows - $importedRows;

// Insert remaining queries afterwards...
...

I've been scouring the internet looking for possible solutions but haven't been able to find anything that worked.

I found this post, which basically says to batch rows together (which I've already done).
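For reference, the batching pattern that post describes boils down to building one multi-row VALUES clause instead of one INSERT per row. Here is a minimal, self-contained sketch of just the query-building step (the table and column names are made up for illustration; note that SQL Server 2008's row constructor also caps a single VALUES list at 1000 rows, which is why batches of 1000 are the practical maximum):

```php
<?php
// Hypothetical rows collected from the CSV; values are assumed to be
// already quoted/escaped for brevity (real code should use parameters).
$rows = [
    ['1', "'Alice'"],
    ['2', "'Bob'"],
];

// Build one "(v1,v2)" tuple per row, then join them into a single INSERT.
$tuples = [];
foreach ($rows as $row) {
    $tuples[] = '(' . implode(',', $row) . ')';
}
$query = 'INSERT INTO stage_users (id,name) VALUES ' . implode(',', $tuples) . ';';
```

Executing `$query` once per batch keeps the per-statement overhead down to one round trip per 1000 rows instead of one per row.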

And I found another post that said, for PDO, to set ConnectionPooling=0. I tried that and didn't see any increase in performance.
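For anyone wanting to try the same thing, this is roughly how that setting is passed to each driver (server, database, and credentials below are placeholders; note that connection pooling for these drivers is Windows/ODBC-level, so this may be a no-op in some environments):

```php
<?php
// SQLSRV driver: ConnectionPooling goes in the options array.
$sqlsrvConn = sqlsrv_connect('myserver', [
    'Database'          => 'mydb',
    'UID'               => 'user',
    'PWD'               => 'pass',
    'ConnectionPooling' => 0,
]);

// PDO_SQLSRV driver: ConnectionPooling goes in the DSN string.
$conn = new PDO(
    'sqlsrv:Server=myserver;Database=mydb;ConnectionPooling=0',
    'user',
    'pass'
);
```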

Has anyone else run into this issue with SQLSRV and PHP?

Cheers,

Upvotes: 4

Views: 1995

Answers (1)

miyasudokoro

Reputation: 1745

I had a somewhat similar issue. Because my problem was a lack of available memory, my server kept taking extra time to handle virtual memory. If that's not your problem, my answer will not be useful to you.

You are using string concatenation followed by substr to remove the last comma. When you use substr, it makes another copy of the string, which is memory-intensive for long strings. See this question for an example of what happens when the strings are long. When I switched to array concatenation, my speed greatly increased due to the lower memory usage. However, if you don't have memory issues, array concatenation could actually be slower for you.

A couple of other things I noticed: you only need to collect the $insert column list once, and you are not unsetting big variables as soon as you no longer need them. I don't know whether correcting that kind of thing will make an appreciable difference for you. Here are the basic kinds of changes you could try:

    if(!isset($insert)) {
        $insert = array();
        $collect = true;
    }
    $values = array();
    foreach ($mapped_data as $k => $v) {
        if(isset($collect)) 
            $insert[] = $k;
        $values[] = $v;
    }
    unset($collect);

    .....
    if(!isset($queryend)) 
         $queryend = array();
    $queryend[] = "(".implode(",",$values).")";

    .....
    $query = "INSERT INTO $tableName ("
             .implode(",",$insert)
             .") VALUES "
             .implode(",", $queryend);
    unset($queryend);  //always unset big things as soon as possible

    .....  //after $res = $conn->prepare($query);
    unset($query);

Upvotes: 1
