Reputation: 231
I realize the knee-jerk response to this question is that "you dont.", but hear me out.
Basically I am running on an active-record system on a SQL, and in order to prevent duplicate objects for the same database row I keep an 'array' in the factory with each currently loaded object (using an autoincrement 'id' as the key).
The problem is that when I try to process 90,000+ rows through this system on the odd occasion, PHP hits memory issues. This would very easily be solved by running a garbage collect every few hundred rows, but unfortunately since the factory stores a copy of each object - PHP's garbage collection won't free any of these nodes.
The only solution I can think of, is to check if the reference count of the objects stored in the factory is equal to one (i.e. nothing is referencing that class), and if so free them. This would solve my issue, however PHP doesn't have a reference count method? (besides debug_zval_dump, but thats barely usable).
Upvotes: 17
Views: 5463
Reputation: 2512
I had a similar problem with the Incredibly Flexible Data Storage (IFDS) file format with trying to keep track of references to objects in an in-memory data cache. How I solved it was to create a ref-counting class that wrapped a reference to the underlying array. I generally prefer arrays over objects as PHP has traditionally tended to handle arrays better than objects with regards to unfortunate things like memory leaks.
class IFDS_RefCountObj
{
public $data;
public function __construct(&$data)
{
$this->data = &$data;
$this->data["refs"]++;
}
public function __destruct()
{
$this->data["refs"]--;
}
}
Since 'refs' is tracked as a regular value in the data, it is possible to know when the last reference to the data has gone away. Regardless of whether multiple variables reference the refcounting object or it is cloned, the refcount will always be non-zero until all references are gone. I don't need to care how many actual references there are internally in PHP as long as the value is correctly zero vs. non-zero. The IFDS implementation also tracks an estimated amount of RAM being used by each object (again, being exact isn't super important as long as it is in the ballpark), allowing it to prioritize writing and releasing unused objects that are occupying system resources first and then writing and releasing portions of still-referenced objects that are caching large quantities of DATA chunk information.
To get back to the topic/question, with this ref-counting class-based approach, it is, for example, mostly straightforward to prune to ~5,000 records in a cache upon hitting 10,000 records in the cache. General strategy is to not get rid of records still being referenced plus keep the most recently requested/used records that aren't being referenced because they are likely to be referenced again. Upon every new reference, unset()
and then setting the item again will move the item to the end of the array so that the oldest probably unreferenced items appear first and the newest probably still referenced items appear last.
Weak references, as several people have suggested, won't solve every caching issue. They don't work in caching scenarios where you don't want to remove an item from the cache until the application is done working with it (i.e. deleting an item that the application later attempts to use) but also want to keep it around as long as RAM overhead permits even if the application stops referencing it temporarily but might need it again in a moment. Weak references are also incapable of working in scenarios where the item in the cache is holding onto unwritten data that may or may not be fine with staying unwritten even if there are no references to it in the application. In short, when there is a balancing act to maintain, weak references cannot be used.
Upvotes: 0
Reputation: 661
PHP 7.4 now has WeakReference
To know if $obj
is referenced by something else or not, you could use:
// 1: create a weak reference to the object
$wr = WeakReference::create($obj);
// 2: unset our reference
unset($obj);
// 3: test if the weak reference is still valid
$res = $wr->get();
if (!is_null($res)) {
// a handle to the object is still held somewhere else in addition to $obj
$obj = $res;
unset($res);
}
Upvotes: 3
Reputation: 3280
I know this is a very old issue, but it still came up as a top result in a search so I thought I'd give you the "correct" answer to your problem.
Unfortunately getting the reference count as you've found is a minefield, but in reality you don't need it for 99% of problems that might want it.
What you really want to use is the WeakRef class, quite simply it holds a weak reference to an object, which will expire if there are no other references to the object, allowing it to be cleaned up by the garbage collector. It needs to be installed via PECL, but it really is something you want in every PHP installation.
You would use it like so (please forgive any typos):
class Cache {
private $max_size;
private $cache = [];
private $expired = 0;
public function __construct(int $max_size = 1024) { $this->max_size = $max_size; }
public function add(int $id, object $value) {
unset($this->cache[$id]);
$this->cache[$id] = new WeakRef($value);
if ($this->max_size > 0) && ((count($this->cache) > $this->max_size)) {
$this->prune();
if (count($this->cache) > $this->max_size) {
array_shift($this->cache);
}
}
}
public function get(int $id) { // ?object
if (isset($this->cache[$id])) {
$result = $this->cache[$id]->get();
if ($result === null) {
// Prune if the cache gets too empty
if (++$this->expired > count($this->cache) / 4) {
$this->prune();
}
} else {
// Move to the end so it is culled last if non-empty
unset($this->cache[$id]);
$this->cache[$id] = $result;
}
return $result;
}
return null;
}
protected function prune() {
$this->cache = array_filter($this->cache, function($value) {
return $value->valid();
});
}
}
This is the overkill version that uses both weak references and a max size (set it to -1 to disable that). Basically if it gets too full or too many results were expired, then it will prune the cache of any empty references to make space, and only drop non-empty references if it has to for sanity.
Upvotes: 3
Reputation: 231
It seems like the best answer was still getting the reference count, although debug_zval_dump and ob_start was too ugly a hack to include in my application.
Instead I coded up a simple PHP module with a refcount() function, available at: http://github.com/qix/php_refcount
Upvotes: 6
Reputation: 4987
Sean's debug_zval_dump function looks like it will do the job of telling you the refcount, but really, the refcount doesn't help you in the long run.
You should consider using a bounded array to act as a cache; something like this:
<?php
class object_cache {
var $objs = array();
var $max_objs = 1024; // adjust to fit your use case
function add($obj) {
$key = $obj->getKey();
// remove it from its old position
unset($this->objs[$key]);
// If the cache is full, retire the eldest from the front
if (count($this->objs) > $this->max_objs) {
$dead = array_shift($this->objs);
// commit any pending changes to db/disk
$dead->flushToStorage();
}
// (re-)add this item to the end
$this->objs[$key] = $obj;
}
function get($key) {
if (isset($this->objs[$key])) {
$obj = $this->objs[$key];
// promote to most-recently-used
unset($this->objs[$key]);
$this->objs[$key] = $obj;
return $obj;
}
// Not cached; go and get it
$obj = $this->loadFromStorage($key);
if ($obj) {
$this->objs[$key] = $obj;
}
return $obj;
}
}
Here, getKey() returns some unique id for the object that you want to store. This relies on the fact that PHP remembers the order of insertion into its hash tables; each time you add a new element, it is logically appended to the array.
The get() function makes sure that the objects you access are kept at the end of the array, so the front of the array is going to be least recently used element, and this is the one that we want to dispose of when we decide that space is low; array_shift() does this for us.
This approach is also known as a most-recently-used, or MRU cache, because it caches the most recently used items. The idea is that you are more likely to access the items that you have accessed most recently, so you keep them around.
What you get here is the ability to control the maximum number of objects that you keep around, and you don't have to poke around at the php implementation details that are deliberately difficult to access.
Upvotes: 6
Reputation: 5403
Yes, you can definitely get the refcount from PHP. Unfortunately, the refcount isn't easily gotten for it doesn't have an accessor built into PHP. That's ok, because we have PREG!
<?php
function refcount($var)
{
ob_start();
debug_zval_dump($var);
$dump = ob_get_clean();
$matches = array();
preg_match('/refcount\(([0-9]+)/', $dump, $matches);
$count = $matches[1];
//3 references are added, including when calling debug_zval_dump()
return $count - 3;
}
?>
Source: PHP.net
Upvotes: 3