Erik Norvelle

Reputation: 21

PHP: Memory not being freed after using unset() or assigning null

I have written a PHP plugin that imports records into a database from an Endnote XML bibliography file. The import process involves several stages. First, the Endnote records are read into memory and turned into an internal, object-based representation. Second, all records to be imported are scanned to check whether their authors, publications, keywords, etc. already have corresponding records in the database.

I am running PHP 5.4.7 (64 bit) on OS X 10.8.2.

In order to accomplish these tasks in a speedy manner, I am doing almost all data storage in memory, as opposed to writing data out to a db or repeatedly consulting the database... all necessary data is read in once and consulted as needed.

This is of course memory-intensive. However, I have developed a number of strategies that have been quite effective in reducing the memory footprint. In particular, I make extensive use of the native PHP serialization/unserialization facilities, together with zlib to compress the serialized representations. Even so, memory usage is uncomfortably high, with the memory limit being exhausted after importing only 500 Endnote records.
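For reference, the serialize-and-compress approach described above looks roughly like the following. This is a minimal sketch, not the plugin's actual code: the `$record` array is a made-up stand-in, and `gzcompress()`/`gzuncompress()` are assumed here as the zlib calls (the plugin may use a different pair).

```php
<?php
// Hypothetical record; the real plugin stores its own object types.
$record = array(
    'author' => 'Example Author',
    'title'  => str_repeat('A long title ', 50),
);

// Pack: serialize to a string, then compress it with zlib.
$serialized = serialize($record);
$packed = gzcompress($serialized);

// Unpack: reverse the two steps to get the original value back.
$restored = unserialize(gzuncompress($packed));

printf("serialized: %d bytes, packed: %d bytes\n", strlen($serialized), strlen($packed));
```

The packed string is only smaller when the serialized data is reasonably repetitive, and the uncompressed copy still exists briefly while packing or unpacking.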

To resolve this, I have tried using the unset() internal function to deallocate all variables, arrays and objects that I no longer need, as soon as I have done with them. Some of these objects are quite large when they are instantiated, which is why I dispose of them as soon as possible. However, after doing some memory use profiling, I am finding that the memory usage reported by memory_get_usage(true) is NOT going down, despite unsetting variables, enabling garbage collection using gc_enable() and requesting periodic garbage collection via gc_collect_cycles().
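One thing worth checking here is which figure is being measured. As far as I understand it, `memory_get_usage(true)` reports memory the engine has allocated from the system, which the allocator may hold on to after `unset()`, while `memory_get_usage()` without the argument reports what the script is actually using. A small sketch to compare the two, where `buildLargeArray()` is a hypothetical stand-in for one of the large objects:

```php
<?php
// Hypothetical helper: builds a large array to stand in for a big object.
function buildLargeArray($count)
{
    $data = array();
    for ($i = 0; $i < $count; $i++) {
        $data[] = str_repeat('x', 100) . $i;
    }
    return $data;
}

$before = memory_get_usage();      // memory in use by the script
$records = buildLargeArray(50000);
$during = memory_get_usage();

unset($records);                   // last reference gone, the array is released
$after = memory_get_usage();

printf("in use: before=%d during=%d after=%d\n", $before, $during, $after);
printf("allocated from system: %d\n", memory_get_usage(true));
```

If the "in use" figure drops after `unset()` while the "allocated from system" figure does not, the memory is being released by the script and merely retained by the allocator.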

I have read other posts which indicate that so long as the reference counts for a particular variable have not gone down to zero, PHP will not free the associated memory, which I understand. I have designed my code to avoid circular references... each of the distinct memory-consuming objects has an independent set of internal storage arrays, none of which share data with other objects. Hence, upon destroying the host object, theoretically, all of its private data should be freed immediately. However, I am not seeing this happening.
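A way to test the assumption that no circular references exist is to look at the return value of `gc_collect_cycles()`, which is the number of cycles it collected. The sketch below builds a cycle on purpose, using a hypothetical `Node` class that is not part of the plugin, to show what a positive result looks like:

```php
<?php
// Hypothetical class, used only to build a deliberate reference cycle.
class Node
{
    public $payload;
    public $peer = null;

    public function __construct()
    {
        $this->payload = str_repeat('x', 100000);
    }
}

gc_enable();

$a = new Node();
$b = new Node();
$a->peer = $b;   // $a references $b ...
$b->peer = $a;   // ... and $b references $a, forming a cycle

unset($a, $b);                      // each object is still referenced by the other
$afterUnset = memory_get_usage();

$collected = gc_collect_cycles();   // the cycle collector reclaims both objects
$afterCollect = memory_get_usage();

printf("collected: %d, freed: %d bytes\n", $collected, $afterUnset - $afterCollect);
```

If the same call returns 0 after one of the real objects is destroyed, cycles are probably not what is keeping the memory alive.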

If anyone wants to look at my code, it is available on Github.

1. The main unit test that puts the various high-memory usage objects through their paces, and measures memory usage (processing 500 Endnote records, it uses 122MB total) is found in the file /test/ObjectStoreTest.php

2. The routine for parsing the Endnote data and turning it into an object-based representation (uses about 15MB during the run) is found in /controller/ParseAndStoreEndnoteRecordsHandler.class.php

3. The class for discovering already-existing authors in the main database (seems to use up about 30MB) is found in /model/resolvers/CreatorExternalReferenceResolver.class.php

I give this information for reference, in case it is needed for answering the question... clearly, I don't expect anybody to spend half their day analyzing my code. Hopefully, however, this information will be sufficient in order to clearly specify the particular memory usage issue I am having.

Upvotes: 1

Views: 528

Answers (1)

Mark Tomlin

Reputation: 8943

The problem you're seeing is due to the fact that PHP's garbage collector has not kicked in yet, or that the reference counts for the memory-consuming objects have not reached zero yet. The GC is more likely to kick in with a lower memory limit, but it looks like you need all of that memory space. I would leave the memory limit as is, or raise it, and let the engine do its job.

The only true fix for this is to upgrade to the alpha version of PHP 5.5.0 and use the generators or coroutines found in that build to keep the memory footprint down. A generator lets you look at one object at a time and does not keep that value in RAM once it moves on to the next object. This allows the garbage collector to do its job, as the reference counts for the finished objects are all zero and they can thus be removed from memory.
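A rough sketch of the generator idea, assuming PHP 5.5 or later. The `readRecords()` function and the record layout are made up for illustration; in the real plugin the generator body would parse the Endnote XML instead:

```php
<?php
// Hypothetical record source: yields one record at a time
// instead of building an array of all of them.
function readRecords($count)
{
    for ($i = 0; $i < $count; $i++) {
        yield array('id' => $i, 'title' => str_repeat('t', 200));
    }
}

$processed = 0;
foreach (readRecords(10000) as $record) {
    // Each iteration overwrites $record, so the previous record
    // loses its last reference and can be released.
    $processed++;
}

printf("processed %d records, peak usage %d bytes\n", $processed, memory_get_peak_usage());
```

This only helps for stages that can work through the records in a single pass. Anything that needs all records in memory at once, such as the lookups described in the question, would still need its own storage.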

Upvotes: 1
