Oleg

Reputation: 2821

Allowed memory size exhausted error exporting from mongodb

I'm trying to export some documents from MongoDB to .csv. For some large lists, where the file would be something like 40M, I get an error about the memory limit:

Fatal error: Allowed memory size of 134217728 bytes exhausted 
(tried to allocate 44992513 bytes) in
/usr/share/php/Zend/Controller/Response/Abstract.php on line 586

I wonder why this error happens. What consumes such an amount of memory? How can I avoid this error without changing memory_limit, which is currently set to 128M?

I use something like this:

public static function exportList($listId, $state = self::SUBSCRIBED)
{
    $list = new Model_List();
    $fieldsInfo = $list->getDescriptionsOfFields($listId);
    $headers = array();
    $params = array('list_id' => $listId);
    $mongodbCursor = self::getCursor($params, $fieldsInfo, $headers);
    $mongodbCursor->timeout(0);
    $fp = fopen('php://output', 'w');
    foreach ($mongodbCursor as $subscriber) {
        $row = array(); // reset per document so stale fields don't leak between rows
        foreach ($fieldsInfo as $fieldInfo) {
            $field = ($fieldInfo['constant']) ? $fieldInfo['field_tag'] : $fieldInfo['field_id'];
            if (!isset($subscriber->$field)) {
                $row[$field] = '';
            } elseif (Model_CustomField::isMultivaluedType($fieldInfo['type'])) {
                // Flatten multivalued fields into a single delimited string
                $row[$field] = array();
                foreach ($subscriber->$field as $value) {
                    $row[$field][] = $value;
                }
                $row[$field] = implode(self::MULTIVALUED_DELEMITOR, $row[$field]);
            } else {
                $row[$field] = $subscriber->$field;
            }
        }
        fputcsv($fp, $row);
    }
    fclose($fp);
}

Then, in my controller, I call it like this:

public function exportAction()
{
    set_time_limit(300);

    $this->_helper->layout->disableLayout();
    $this->_helper->viewRenderer->setNoRender();

    // $list and $listId are resolved from the request earlier in the action (omitted here)
    $fileName = $list->list_name . '.csv';

    $this->getResponse()->setHeader('Content-Type', 'text/csv; charset=utf-8')
                        ->setHeader('Content-Disposition', 'attachment; filename="' . $fileName . '"');

    Model_Subscriber1::exportList($listId);
    echo 'Peak memory usage: ', memory_get_peak_usage() / 1024, ' Memory usage: ', memory_get_usage() / 1024;
}

So the memory statistics end up at the end of the exported file. It's rather strange that a list of something like 1M documents exports successfully and shows:

> Peak memory usage: 50034.921875 Kb Memory usage: 45902.546875 Kb

But when I try to export 1.3M documents, then after several minutes all I get in the export file is:

Fatal error: Allowed memory size of 134217728 bytes exhausted 
(tried to allocate 44992513 bytes) in
/usr/share/php/Zend/Controller/Response/Abstract.php on line 586.

The sizes of the documents I export are approximately the same.

I increased memory_limit to 256M and tried to export the 1.3M list; this is what it showed:

> Peak memory usage: 60330.4609375 Kb Memory usage: 56894.421875 Kb

This seems very confusing to me. Is this data simply inaccurate? If it isn't, why does the export hit the memory exhausted error with memory_limit set to 128M?

Upvotes: 0

Views: 3187

Answers (2)

Fabien Haddadi

Reputation: 2080

I could reproduce this issue in a similar case of exporting a CSV file: my system should have had enough memory, as shown by memory_get_usage(), but it ended up with the same Fatal error: Allowed memory size.

I circumvented this issue by outputting the CSV contents into a physical temporary file, which I eventually zipped before reading it out. I wrote the file in a loop, so that each iteration wrote only a limited chunk of data and I never exceeded the memory limit. After zipping, the compression ratio was such that I could handle raw files over 10 times the size at which I initially hit the wall. All up, it was a success.
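For illustration, here is a minimal sketch of that chunked write. The $cursor iterable and $tmp_csv_path are assumptions for the example; the real per-field row building from the question's exportList() is elided to a cast:

<?php
// Write the CSV row by row into a temporary file, so only one row
// is ever held in memory, instead of echoing into the response body.
$tmp_csv_path = tempnam(sys_get_temp_dir(), 'export_'); // hypothetical temp path
$fp = fopen($tmp_csv_path, 'w');
foreach ($cursor as $document) {   // $cursor: your MongoDB cursor
    $row = (array) $document;      // placeholder: build the real row here
    fputcsv($fp, $row);            // each iteration writes one small chunk
}
fclose($fp); // the file is now ready to be zipped and streamed out
?>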

Hint: when creating your archive, don't unlink the archive component(s) before invoking $zip->close(), as that call seems to be the one doing the business; otherwise you'll end up with an empty archive!

Code sample:

<?php
// $full_zip_path, $full_csv_path, $csv_name, $zip_name are defined earlier;
// $Response is the Zend_Controller_Response object.
$zip = new ZipArchive;
if ($zip->open($full_zip_path, ZipArchive::CREATE) === TRUE) {
    $zip->addFile($full_csv_path, $csv_name);
    $zip->close(); // close() actually writes the archive; don't unlink components before it

    $Response->setHeader("Content-type", "application/zip; charset=utf-8");
    $Response->setHeader("Content-disposition", "attachment; filename=" . $zip_name);

    // The zipped file is small enough to fit in the response body
    $Response->setBody(file_get_contents($full_zip_path));
} else {
    var_dump(error_get_last());
    echo utf8_decode("Couldn't create zip archive '$full_zip_path'."), "\r\n";
}

unset($zip);
?>

Attention: when adding items to the zip archive, don't prepend a leading slash to the item's name if you are on a Windows-based OS.
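To make the hint above concrete, here is the teardown order it implies, reusing the same hypothetical path variables:

<?php
$zip = new ZipArchive;
$zip->open($full_zip_path, ZipArchive::CREATE);
$zip->addFile($full_csv_path, $csv_name);
$zip->close();            // the archive is only written to disk here
unlink($full_csv_path);   // safe to remove the component now, not before close()
// ... send the zip to the client, then:
unlink($full_zip_path);   // clean up the archive once it has been read out
?>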

Discussion of the original issue:

The Zend code at the quoted line is the outputBody() method of the Zend_Controller_Response_Abstract class:

public function outputBody()
{
    $body = implode('', $this->_body);
    echo $body;
}

That implode() is where the single large string (the full CSV accumulated in the body) gets allocated, which matches the roughly 45M allocation in the error message.

It looks like, however you do it (through echo, print, or readfile), the output is always captured and stuck into the response body, even if you turn the response return feature off before the dispatch.

I even tried to use the clearBody() class method within the echo loop, with the idea that each $response->sendResponse() followed by $response->clearBody() would release memory, but it failed. The way Zend handles the sending of the response is such that I always got a memory allocation of the full size of the raw CSV file.

It remains to be determined how to tell Zend not to "capture" the output buffer.
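One avenue worth trying (an assumption on my part, not something verified in this answer): ZF1's standard dispatcher only wraps the action in ob_start() when the front controller's disableOutputBuffering parameter is empty, so setting that parameter should let echoed output stream past the response body:

<?php
// In the bootstrap: ask the dispatcher not to buffer action output, so echoed
// CSV rows go to the client instead of piling up in the response body.
Zend_Controller_Front::getInstance()->setParam('disableOutputBuffering', true);

// In the action: send headers manually before streaming, then bail out so the
// (now empty) response isn't sent a second time.
$this->getResponse()->setHeader('Content-Type', 'text/csv; charset=utf-8')
                    ->sendHeaders();
Model_Subscriber1::exportList($listId); // writes row by row to php://output
exit;
?>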

Upvotes: 0

spf13

Reputation: 561

While the size of the documents may be about the same, the memory PHP allocates to process them isn't directly proportional to the document size or to the number of documents, because different types require different memory allocations in PHP. You may be able to free some memory as you go, but I don't see any place in your code where you can.

The best answer is probably just to increase the memory limit.

One thing you could do is offload the processing to an external script and call that from PHP. Many languages do this sort of processing more memory-efficiently than PHP.
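To illustrate that idea (this sketch is mine, not part of the original answer): MongoDB ships a mongoexport command-line tool that can emit CSV itself, so PHP never has to hold the data. The database, collection, and field names below are placeholders:

<?php
// Placeholders: adjust db, collection, fields, and query to your schema.
// Note: newer mongoexport versions use --type=csv instead of --csv.
$cmd = sprintf(
    'mongoexport --db %s --collection %s --csv --fields %s --query %s',
    escapeshellarg('mydb'),
    escapeshellarg('subscribers'),
    escapeshellarg('email,first_name,last_name'),
    escapeshellarg(json_encode(array('list_id' => $listId)))
);
// mongoexport writes to stdout by default; passthru() streams it straight
// to the client without buffering it in PHP.
passthru($cmd, $exitCode);
if ($exitCode !== 0) {
    error_log("mongoexport failed with exit code $exitCode");
}
?>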

I've also noticed that memory_get_peak_usage() isn't always accurate. I would try an experiment: increase memory_limit to, say, 256M and run it on the larger data set (the 1.3 million). You are likely to find that it reports below the 128M limit as well.
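A quick way to run that experiment, sketched here as an illustration (not from the original answer). Note that memory_get_peak_usage(true) reports the real memory reserved from the system, which can be noticeably higher than the default figure and may account for part of the discrepancy:

<?php
ini_set('memory_limit', '256M'); // raise the limit for this run only

// ... run the 1.3M-document export here ...

printf(
    "emalloc peak: %.1f Kb, real peak: %.1f Kb\n",
    memory_get_peak_usage(false) / 1024, // memory actually used by PHP code
    memory_get_peak_usage(true) / 1024   // memory reserved from the OS
);
?>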

Upvotes: 1
