Kasia Gogolek

Reputation: 3414

PHPExcel memory issue

I'm trying to loop through a 3 MB Excel document to get all the data, which I then have to insert into the database. The worksheet I'm using has 6,500 rows, but that may vary in the future. Even though I'm using the recommended memory-saving techniques, the script still runs out of memory:

$reader = PHPExcel_IOFactory::createReaderForFile($file_path);
$reader->setReadDataOnly(true);

//$sheets = $this->getWorksheetNames($file['tmp_name'], 0);
$reader->setLoadSheetsOnly('spreadsheetname');

$chunkFilter = new IPO_Reader(); 
$reader->setReadFilter($chunkFilter); 

$highestRow    = 10000; //$this->objWorksheet->getHighestRow();
$chunkSize     = 1; 
$highestColumn = "Y";

for ($startRow = 2; $startRow <= $highestRow; $startRow += $chunkSize) 
{ 

    $chunkFilter->setRows($startRow, $chunkSize); 
    $objPHPExcel  = $reader->load($file_path); 

    // '<' rather than '<=': each chunk covers exactly $chunkSize rows, so no row is processed twice
    for($row = $startRow; $row < $startRow + $chunkSize; $row++)
    {
        $this->read_row = $objPHPExcel->getActiveSheet()->rangeToArray('A'.$row.':'.$highestColumn.$row, null, true, true, true);

        $this->read_row = end($this->read_row);         

        foreach($this->read_row as $column => $value)
        {
            $db_column_name = $this->_getDbColumnMap($column);
            if(!empty($db_column_name))
            {
                $this->new_data_row[$db_column_name] = $this->_getRowData($value, $column);
            }   

        }

        $this->read_row = null;
        $this->new_data_row['date_uploaded']    = date("Y-m-d H:i:s");
        $this->new_data_row['source_file_name'] = $file_name;
        $ipo_row  = new Model_UploadData_IPO();
        $ipo_row->create($this->new_data_row);
        $this->new_data_row = null;
        unset($ipo_row);

        gc_collect_cycles();

    }
    $objPHPExcel->disconnectWorksheets();
    unset($objPHPExcel);
    gc_collect_cycles();
}

When I measure memory usage before and after unsetting $objPHPExcel, nothing is freed. I'm really confused: splitting the file into chunks does not seem to let me release memory after each chunk, so usage rises gradually, and with the limit set to 250 MB I can only insert about 500 records.
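For what it's worth, here is a minimal sketch of how such a before/after measurement behaves when objects reference each other in a cycle, which is the situation disconnectWorksheets() is meant to deal with. DemoBook and DemoSheet are hypothetical stand-ins, not PHPExcel classes, and the sketch makes no claim about where the reader itself holds on to memory:

```php
<?php
// Hypothetical stand-ins for a workbook and its worksheets (NOT PHPExcel classes).
class DemoBook
{
    public $sheets = [];

    public function addSheet(DemoSheet $sheet)
    {
        $sheet->parent = $this;   // the sheet points back at the book: a reference cycle
        $this->sheets[] = $sheet;
    }

    public function disconnect()
    {
        foreach ($this->sheets as $sheet) {
            $sheet->parent = null; // break the cycle by hand
        }
        $this->sheets = [];
    }
}

class DemoSheet
{
    public $parent = null;
    public $cells;

    public function __construct()
    {
        $this->cells = range(1, 50000); // some bulk so the effect is measurable
    }
}

gc_enable();
gc_collect_cycles(); // start from a clean slate

// Case 1: unset() alone leaves the cycle alive until the collector runs.
$book = new DemoBook();
$book->addSheet(new DemoSheet());
$before     = memory_get_usage();
unset($book);
$afterUnset = memory_get_usage();
$collected  = gc_collect_cycles();
$afterGc    = memory_get_usage();

// Case 2: break the cycle first, then unset(); reference counting can free it directly.
$book = new DemoBook();
$book->addSheet(new DemoSheet());
$book->disconnect();
unset($book);
$collectedAfterDisconnect = gc_collect_cycles();

printf("before unset: %d bytes\n", $before);
printf("after unset:  %d bytes\n", $afterUnset);
printf("after gc:     %d bytes (collector freed %d values)\n", $afterGc, $collected);
printf("collector count after disconnect + unset: %d\n", $collectedAfterDisconnect);
```

Comparing memory_get_usage() at the same three points in the real loop (before unset, after unset, after gc_collect_cycles()) should show whether the growth comes from uncollected cycles or from something else that is still referenced.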

Upvotes: 8

Views: 10257

Answers (2)

Morg.

Reputation: 701

OK, everyone knows the real WTF is Excel, so may I ask whether you could convert this to CSV?

I have my own CSV-to-table functions in PHP, which I have used to import very large files. CSV tends to be much lighter to process and much less prone to random library issues.

If you need this only for a one-time process, or can get from XLS to CSV easily, please do so: it will make your life much easier (as it does every time you stick to simpler, more standard alternatives ;) ).

As for a tool to translate the oh-so-evil and dreadful XLS format, you can use one of the following open-source converters. I'd recommend Python every time, but hey, your choice:

http://www.oooninja.com/2008/02/batch-command-line-file-conversion-with.html

http://code.google.com/p/jodconverter/wiki/FAQ

Basically the idea is the same: use an external tool to get a usable file format, then go from there.

I don't think I have my csvtotable.php script here, but it's quite easy to replicate; you just need a few basic tools like csvtoarray and then arraytoinsertstatements.
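A minimal sketch of what those two helpers might look like, assuming a CSV whose first line holds the column names. The function names csvToRows and rowToInsert, the table name upload_data and the sample columns are all hypothetical; this is not the original csvtotable.php:

```php
<?php
/**
 * Yield each CSV record as an associative array keyed by the header row.
 * Being a generator, it holds only one record in memory at a time.
 */
function csvToRows($handle)
{
    $header = fgetcsv($handle, 0, ',', '"', '\\');
    if ($header === false || $header === null) {
        return; // empty file: nothing to yield
    }
    while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($fields === [null]) {
            continue; // blank line
        }
        if (count($fields) !== count($header)) {
            continue; // malformed record; a real import might log or throw here
        }
        yield array_combine($header, $fields);
    }
}

/**
 * Build a parameterised INSERT for one row.
 * Table and column names are interpolated into the SQL, so they must come
 * from a trusted list (assumption: the CSV header is trusted in this sketch).
 */
function rowToInsert($table, array $row)
{
    $columns      = array_keys($row);
    $placeholders = implode(', ', array_fill(0, count($columns), '?'));
    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES (%s)',
        $table,
        implode(', ', $columns),
        $placeholders
    );
    return [$sql, array_values($row)];
}

// Usage with an in-memory stream standing in for the converted file.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "company,price\nAcme,12.5\nGlobex,7\n");
rewind($handle);

$inserts = [];
foreach (csvToRows($handle) as $row) {
    $inserts[] = rowToInsert('upload_data', $row);
}
fclose($handle);
```

In a real import, each pair would go straight to something like a PDO prepared statement ($pdo->prepare($sql)->execute($params)) inside the loop instead of being collected in $inserts, so memory use stays flat regardless of file size.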

GL ;)

Upvotes: 0

Tomas

Reputation: 59475

The PHPExcel library is known to have these memory issues; I ran into the same problem. What worked for me was this advice (from the above link; try it, it has good tips on reducing memory usage):

$objReader = new PHPExcel_Reader_Excel5();
$objReader->setReadDataOnly(true); /* this */

Even so, the memory requirements are large, because the library allocates a lot of memory for each cell (for formatting etc., even if you don't need it). I'm afraid we are helpless until a new version of the library is released.

Upvotes: 3
