Reputation: 169
I'm reading through "learn rust with entirely too many linked lists" . In chapter 2.7 the author says that "We can't drop the contents of the Box after deallocating". How can a value be deallocated after it is dropped? Wouldn't the deallocation occur in the implementation of drop or are these separate concepts?
For some context, the author is explaining why the particular implementation of a linked list is not tail recursive and thus cannot be optimised by the compiler.
Linked List:
pub struct List {
head: Link,
}
enum Link {
Empty,
More(Box<Node>),
}
struct Node {
elem: i32,
next: Link,
}
Drop implementation :
impl Drop for List {
fn drop(&mut self) {
// NOTE: you can't actually explicitly call `drop` in real Rust code;
// we're pretending to be the compiler!
self.head.drop(); // tail recursive - good!
}
}
impl Drop for Link {
fn drop(&mut self) {
match *self {
Link::Empty => {} // Done!
Link::More(ref mut boxed_node) => {
boxed_node.drop(); // tail recursive - good!
}
}
}
}
impl Drop for Box<Node> {
fn drop(&mut self) {
self.ptr.drop(); // uh oh, not tail recursive!
deallocate(self.ptr);
}
}
impl Drop for Node {
fn drop(&mut self) {
self.next.drop();
}
}
Upvotes: 2
Views: 809
Reputation: 155507
Wouldn't the deallocation occur in the implementation of drop or are these separate concepts?
They're separate concepts. In the example code, Drop
means "clean up my contents/owned resources", deallocate
means "return the memory I was actually stored in to the pool".
The drop
method is run with a reference to the existing instance. The definition of drop
actually says:
When this method has been called,
self
has not yet been deallocated. That only happens after the method is over. If this wasn’t the case,self
would be a dangling reference.
As a result, the head
must be fully dropped before it can be deallocated. And the dropping process requires dropping all the nodes in the list first, each of which will deallocate the associated node only after the drop finishes, leading to a stack of drop
s, each of which will be followed by a deallocate
, but only after all the drop
s and deallocates
above them on the stack complete. Since you can't perform a single deallocate
until drop
has been called along the entire stack (at which point the drop
for the tail node returns and the tail node is the first deallocated), it can't be implemented tail-recursively.
Side-note: This is the exact same problem you see in C++ trying to implement a linked list using std::shared_ptr
or std::unique_ptr
for the next
pointer. Just like Rust, C++ distinguishes destruction (cleaning resources within the object) from deallocations (releasing the memory that held the object). When the head
unique_ptr
is cleared, before it can release the memory it has to tell "head
+ 1" to clear its own unique_ptr
, which in turn tells the "head
+ 2" to clear its unique_ptr
and so on. In every case, destruction (via the destructor) must precede deallocation (which might occur via operator delete
for heap allocated stuff, or just by the compiler contracting the stack without actually releasing any memory for stack allocated stuff).
Upvotes: 4
Reputation: 16515
The problem with tail-recursion on impl Drop for Box<Node>
is that we have to self.ptr.drop()
and then deallocate(self.ptr)
; tail-recursion would require us to deallocate(self.ptr)
and then self.ptr.drop()
(so drop()
recurses in the function's tail position).
The problem is that when we deallocate(self.ptr)
first, the content of self.ptr
is immediately in an undefined state because it's deallocated memory. If we then self.ptr.drop()
, we would need a &mut Box<Node>
to do the drop()
call (where this shows up as &mut self
).
Not only would this mean that the destructor effectively accesses memory that has been deallocated. It also means that we have to create a reference
in the form of a &mut Box<Node>
that points to memory that is in an undefined state. In Rust, this would be Undefined Behavior, as Rust gives very strict guarantees about the memory a reference
points to.
Upvotes: 1