PriestVallon
PriestVallon

Reputation: 1518

PHPExcel Memory Usage

I have the following code

<?php

ini_set('memory_limit','1600M');
ini_set('max_execution_time', 3000);

require("phpexcel/Classes/PHPExcel.php");


$inputFileName = 'testa.xlsx';

$inputFileType = PHPExcel_IOFactory::identify($inputFileName);

function convert($size)
{
    $unit=array('b','kb','mb','gb','tb','pb');
    return @round($size/pow(1024,($i=floor(log($size,1024)))),2).' '.$unit[$i];
}

/**  Define a Read Filter class implementing PHPExcel_Reader_IReadFilter  */
class chunkReadFilter implements PHPExcel_Reader_IReadFilter
{
    private $_startRow = 0;

    private $_endRow = 0;

    /**  Set the list of rows that we want to read  */
    public function setRows($startRow, $chunkSize) {
        $this->_startRow    = $startRow;
        $this->_endRow        = $startRow + $chunkSize;
    }

    public function readCell($column, $row, $worksheetName = '') {
        //  Only read the heading row, and the rows that are configured in     $this->_startRow and $this->_endRow
        if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)){
            return true;
        }
    return false;
    }
}

/**  Create a new Reader of the type defined in $inputFileType  **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);



echo '<hr />';


/**  Define how many rows we want to read for each "chunk"  **/
$chunkSize = 25;
/**  Create a new Instance of our Read Filter  **/
$chunkFilter = new chunkReadFilter();

/**  Tell the Reader that we want to use the Read Filter that we've Instantiated  **/
$objReader->setReadFilter($chunkFilter);

/**  Loop to read our worksheet in "chunk size" blocks  **/
/**  $startRow is set to 2 initially because we always read the headings in row     #1  **/

for ($startRow = 2; $startRow <= 100; $startRow += $chunkSize) {

    /**  Tell the Read Filter, the limits on which rows we want to read this     iteration  **/
    $chunkFilter->setRows($startRow,$chunkSize);
    /**  Load only the rows that match our filter from $inputFileName to a PHPExcel Object  **/
    $objPHPExcel = $objReader->load($inputFileName);

    //    Do some processing here

    $sheetData = $objPHPExcel->getActiveSheet();
    $highestRow = $sheetData->getHighestRow();
    //$sheetData = $sheetData->toArray(null,true,true,true);
    //var_dump($sheetData);
    echo '<br /><br />';
    echo convert(memory_get_peak_usage(true));
}
?>

and when run it outputs this response.

277 mb
294.5 mb
295.5 mb
296.75 mb

It reads 25 lines at a time and so on throughout the file. What I can't figure out is, why does the memory peak keep rising?

I know that a whole Excel file has to be read before it can processed but surely that should use the same amount of memory every time and therefore the memory usage shouldn't change very much over time. However it appears to be constantly on the rise and what I cannot figure out is why.

Upvotes: 2

Views: 3914

Answers (3)

Scott Robertson
Scott Robertson

Reputation: 51

I was running into a similar issue and I believe I've tracked it down to the PHPExcel_Calculation class of the PHPExcel library. In my testing, I'm seeing that its $_workbookSets array is never emptied out and continues to have more instances added to it with each chunk iteration.

Unfortunately I haven't been able to track down the exact cause, but it seems that the unsetInstance() method is only called at the very end of script execution, when the PHPExcel class destructor is called.

Calling the disconnectWorksheets() method didn't have any effect in fixing this, nor did forcing PHP's garbage collection via gc_collect_cycles().

My temporary solution has been to add a new unsetInstances() static method to the Calculation class, which sets $_workbookSets to an empty array, and then call that method at the end of my chunk loop.

In Calculation.php of the PHPExcel library:

public static function unsetInstances() {
  self::$_workbookSets = array();
}

And then call the function as the last line in your loop:

PHPExcel_Calculation::unsetInstances();

Upvotes: 0

pancy1
pancy1

Reputation: 511

There are plenty of measures you can take to reserve less memory when working with PHPExcel. I recommend you to take the following actions to optimize memory usage before modifying your server's memory limit in Apache.

/* Use the setReadDataOnly(true);*/
    $objReader->setReadDataOnly(true);

/*Load only Specific Sheets*/
    $objReader->setLoadSheetsOnly( array("1", "6", "6-1", "6-2", "6-3", "6-4", "6-5", "6-6", "6-7", "6-8") );

/*Free memory when you are done with a file*/
$objPHPExcel->disconnectWorksheets();
   unset($objPHPExcel);

Avoid using very large Exel files, remember it is the file size that makes the process run slowly and crash.

Avoid using the getCalculatedValue(); function when reading cells.

Upvotes: 3

Adrien
Adrien

Reputation: 1947

Even though you are reading data by chunks, PHPExcel holds an in memory representation of a spreadsheet. The more data you read, the more memory you will need.

Holding the representation in memory is useful to be able to add/edit cells anywhere in your spreadsheet as well as doing some calculations on rows/columns (for instance, to adjust the width of a column, you need to know the width of every single non-empty cell in that column and having all that data in memory makes it easier to retrieve).

In general, every cell you read will take 1K of memory. You can optimize this by using the different caching mechanisms that PHPExcel provides. Although memory optimization comes with a performance penalty, so it's a tradeoff.

Upvotes: 0

Related Questions