BeetleJuice

Reputation: 40896

PHP garbage collection: why is this object still referenced

I just read the manual's page on Objects and references

I am using a third-party library that executes a callback I provide when the library is done with an object. In the callback I'd like to drop the reference to the object so the garbage collector (GC) can clean it up.

function onDone($object){
    // do stuff
    unset($object);
}

This doesn't unset the object in the calling code's context. I understand that this is because $object in the function is just another copy of the reference to the data in memory. Unsetting that copy leaves the original reference intact. So I made a change:

function onDone(&$object){
    //do stuff
    unset($object);
}

Through passing-by-reference &, I thought the unset would actually cut off the only link to the data, freeing the GC to clean it up. That's not the case; the data is still accessible (see demo). Why is that?

Live Demo

Upvotes: 2

Views: 1876

Answers (1)

Sherif

Reputation: 11943

TL;DR

The short answer is that references are merely a way for two variables to share the same value and unset() only deletes a variable, not a value. The key thing to remember here is that variables have values, while references link values, not variables.

The Long and Long of It...

First, understanding how PHP removes objects from memory...

Objects are only removed from memory when the last reference to that object is deleted. When I say reference I don't mean the same thing as references in PHP, like the one you're describing here in the pass-by-reference example. Instead, any variable that is assigned the object is considered something that references this object. PHP will not remove the object from memory as long as at least one such variable exists.

Because you create the object outside of the function, then call the function, there are now two places that reference the same object. One is the global variable that instantiated the object. The second is the local variable, in your function, that's using the object. Pass-by-reference or assign-by-reference has no bearing on this behavior, whatsoever, because it's a totally different thing.

When you create an object in PHP and assign it to a variable, the variable does not store the object itself. Instead, it stores a unique handle that points to the object in memory. The object is stored in an object store that only PHP has direct control over. This is by design, because PHP manages memory for you. It does not expect you to understand, or have to care about, how memory is allocated or freed. It tries to manage memory for you as efficiently as possible through these abstractions.
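To make the handle idea concrete, here is a minimal sketch. The Widget class and the variable names are made up for illustration, and spl_object_id() needs PHP 7.2 or later. It shows that plain assignment copies the handle, so both variables reach the same object, while clone is what produces a second object.

```php
<?php
// Hypothetical class, used only for illustration.
class Widget
{
    public $label = 'original';
}

$a = new Widget();
$b = $a;          // copies the handle, not the object

$b->label = 'changed through $b';

// Both variables resolve to the same object.
var_dump($a === $b);                                 // bool(true)
var_dump(spl_object_id($a) === spl_object_id($b));   // bool(true)
echo $a->label, PHP_EOL;                             // changed through $b

$c = clone $a;    // clone creates a second, independent object
var_dump($a === $c);                                 // bool(false)
```

Note that no & appears anywhere in this sketch: sharing an object between variables does not require PHP references at all.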

So in the following code the object Foo is not really deleted until after we get to the last line of this example.

// $quix is declared up front; creating it dynamically is deprecated as of PHP 8.2
class Foo { public $quix; }

$foo = new Foo; // Object is initialized and stored in memory

$fooCopy = $foo; // The same object handle is copied to $fooCopy

bar($foo);

function bar($foo) {
    unset($foo); // object is still in memory
}


baz($foo);

function baz(&$foo) {
    unset($foo); // object is still in memory
}


$foo->quix = 1; // object is still in memory

unset($foo); // object is still in memory because $fooCopy still holds its handle

$fooCopy->quix++;

var_dump($fooCopy->quix); // int(2)

unset($fooCopy); // object is now deleted because last reference is gone

As you can see from this example, there is a very good reason why PHP won't delete the object in these functions, or even when we do unset($foo): otherwise, by the time we get to the last few lines of this script, the code would not work as we expect. The object would be freed prematurely. PHP just assumes that since you still have at least one variable pointing to the object, you might still need to use it somewhere down the line. So it does not free it until it reaches a point where nothing points to that object (i.e. nothing can use it).

Ref Counted GC

This is called ref-counted GC. In principle, each time another variable is assigned the same object handle, the object's ref-count is incremented. Each time such a variable is deleted, the ref-count is decremented. Once the ref-count reaches 0, then, and only then, does PHP destroy the object and release its memory. This happens right away; the separate cycle collector is only needed for objects that reference each other in a loop. So in the example above, the variable $foo creates a ref-count of 1 to the Foo object that's stored in memory. The variable $fooCopy increments that ref-count to 2. At the point we call the function bar(), its parameter is a third holder of the handle, so the ref-count is at 3. By the time we unset() the parameter or return from bar(), the ref-count goes back down to 2. baz() is slightly different: its parameter shares the caller's variable through a reference instead of holding its own copy of the handle, so unsetting it only removes the local name and the object keeps its two holders. After unset($foo) the ref-count is down to 1, so PHP will not delete the object. Finally, we reach unset($fooCopy) and the ref-count is now 0. The same thing would happen if the script just ended: PHP would implicitly clean up all memory.
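One way to watch the count reach zero is to give the object a destructor. This is only a sketch: the Tracked class and its static $log array are made up for illustration, and it relies on the object not being part of a reference cycle.

```php
<?php
// Hypothetical class: the destructor records the moment PHP destroys the object.
class Tracked
{
    public static $log = [];

    public function __destruct()
    {
        self::$log[] = 'destroyed';
    }
}

$first = new Tracked();   // one variable holds the handle
$second = $first;         // two variables hold the handle

unset($first);            // one holder left, nothing is destroyed
Tracked::$log[] = 'after unset($first)';

unset($second);           // last holder gone: the destructor runs here
Tracked::$log[] = 'after unset($second)';

print_r(Tracked::$log);
```

The log ends up in the order 'after unset($first)', 'destroyed', 'after unset($second)', which shows that the first unset() changes nothing for the object and the second one destroys it on the spot.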

Why unset() on references doesn't work with Objects

To answer your question specifically, about why using pass-by-reference and calling unset() on the object doesn't remove the last reference to the object, we have to explain a bit more about how references work in PHP, in contrast to objects.

$obj = new stdClass;

foo($obj);

function foo(&$obj) {
    unset($obj);
}

var_dump($obj); // this is still here

A reference, in PHP, is a way of having two variables share the same value. But an object is not stored inside a variable in PHP. Instead, all that is stored is the handle that points to that object (i.e. another level of indirection). So by using pass-by-reference, all you've managed to accomplish is to have two different variables share the same object handle. By deleting one of these variables you're still left with the other variable pointing to that handle. So the ref count is still at 1 regardless of which variable you delete.

Now, if you were to assign null to either variable, the shared value changes for both of them, so the one and only handle pointing to that object is lost, and as such PHP will remove the object.

$obj = new stdClass;

foo($obj);

function foo(&$obj) {
    $obj = null;
}

var_dump($obj); // this is now null and the object is gone

Try to think of it like this. The thing that's actually assigned to $obj, both in the local variable, and global variable, is just the object's handle, which is the thing that points to the object itself. So all unset($obj) inside this function does is delete the local variable. Deleting one does not delete both.

[Image: PHP object memory]

By assigning a value to a variable that is a reference to another variable, however, you get the same value in both places.

[Image: PHP object memory with references]

Remember, unset($obj) inside the function, only deletes the local variable and breaks the reference. It does not delete the object, because the variable outside of the function still continues to reference the handle.
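If you want to verify this yourself, a weak reference (PHP 7.4 or later) lets you observe an object without keeping it alive. The sketch below is an illustration, not the library's API: releaseWithUnset() and releaseWithNull() are hypothetical stand-ins for the onDone() callback in the question, and it assumes the caller's variable is the only thing holding the object.

```php
<?php
// Hypothetical callbacks standing in for the onDone() handler from the question.
function releaseWithUnset(&$object)
{
    unset($object);   // removes only the local name; the caller's variable keeps the handle
}

function releaseWithNull(&$object)
{
    $object = null;   // writes through the reference, so the caller's variable becomes null
}

$job = new stdClass();
$probe = WeakReference::create($job);   // observes the object without keeping it alive

releaseWithUnset($job);
var_dump($probe->get() !== null);   // bool(true): the object is still alive

releaseWithNull($job);
var_dump($job);                     // NULL
var_dump($probe->get());            // NULL: the object has been destroyed
```

If the library keeps its own copy of the handle somewhere, the object will stay alive until the library lets go of it too, no matter what the callback does.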

Upvotes: 4
