Reputation: 27975
I'm trying out random things to deepen my understanding of Rust. I just ran into the following error with this code:
struct Person {
mother: Option<Person>,
father: Option<Person>,
partner: Option<Person>,
}
pub fn main() {
let susan = Person {
mother: None,
father: None,
partner: None,
};
let john = Person {
mother: None,
father: None,
partner: Some(susan),
};
}
error[E0072]: recursive type `Person` has infinite size
--> src/main.rs:1:1
|
1 | struct Person {
| ^^^^^^^^^^^^^ recursive type has infinite size
2 | mother: Option<Person>,
| ---------------------- recursive without indirection
3 | father: Option<Person>,
| ---------------------- recursive without indirection
4 | partner: Option<Person>,
| ----------------------- recursive without indirection
|
= help: insert indirection (e.g., a `Box`, `Rc`, or `&`) at some point to make `Person` representable
I understand that I can fix it if I put the Person
in a Box
, so this works:
struct Person {
mother: Option<Box<Person>>,
father: Option<Box<Person>>,
partner: Option<Box<Person>>,
}
pub fn main() {
let susan = Person {
mother: None,
father: None,
partner: None,
};
let john = Person {
mother: None,
father: None,
partner: Some(Box::new(susan)),
};
}
I would like to understand the full story behind that. I know that boxing means that it will be stored on the heap rather than the stack but I don't get why this indirection is necessary.
Upvotes: 100
Views: 35249
Reputation: 22396
Be aware that while a Box
does in fact fix your problem on the surface, it makes it impossible to actually create an instance of it if you intend to have a bidirectional link between the partners:
struct Person {
partner: Option<Box<Person>>,
}
pub fn main() {
let susan = Person { partner: None };
let mut john = Person {
partner: Some(Box::new(susan)),
};
// This moves `susan` into `john`, meaning `susan` is now only
// accessible through `john.partner`.
let susan = john.partner.as_mut().unwrap();
// It is literally impossible to set john as a partner of susan,
// no matter what you try. (without using `unsafe`)
susan.partner = Some(Box::new(john));
}
error[E0505]: cannot move out of `john` because it is borrowed
--> src/main.rs:18:35
|
14 | let susan = john.partner.as_mut().unwrap();
| --------------------- borrow of `john.partner` occurs here
...
18 | susan.partner = Some(Box::new(john));
| ------------- ^^^^ move out of `john` occurs here
| |
| borrow later used here
Box
is only useful if you have a tree-like ownership chain, where someone owns the topmost item.
Your situation is not completely lost, however, just a little more complicated.
For once, you could use Rc
instead of Box
to do this. This is a little dangerous, though, because a circular Rc
chain will leak and never be dropped unless you manually break the cycle at some point. Remember, Rust does not have a garbage collector.
One solution that I could see is Weak
, which is a version of Rc
that specifically does not keep the object it points to alive. It is made specifically for circular references like this. Note, however, that this makes the objects immutable and therefore we need to create interior mutability with RefCell
.
use std::{
cell::RefCell,
rc::{Rc, Weak},
};
#[derive(Debug)]
struct Person {
name: String,
partner: Option<Weak<RefCell<Person>>>,
}
impl Person {
fn partner_name(&self) -> Option<String> {
self.partner
.as_ref()
.map(|partner| Weak::upgrade(partner).unwrap())
.map(|partner| RefCell::borrow(&partner).name.clone())
}
}
pub fn main() {
let susan = Rc::new(RefCell::new(Person {
name: "Susan".to_string(),
partner: None,
}));
let john = Rc::new(RefCell::new(Person {
name: "John".to_string(),
partner: Some(Rc::downgrade(&susan)),
}));
// Now we can actually set them to be each other's partner:
RefCell::borrow_mut(&susan).partner = Some(Rc::downgrade(&john));
// Both `susan` and `john` are still accessible
println!("John: {:?}", john);
println!(
"John's partner: {:?}\n",
RefCell::borrow(&john).partner_name()
);
println!("Susan: {:?}", susan);
println!(
"Susan's partner: {:?}\n",
RefCell::borrow(&susan).partner_name()
);
// Note that `susan` and `john` do not keep each other alive, as they
// are only `Weak` references. Therefore dropping the `Rc` handle
// Will cause the `Weak` handle to lose the connection.
drop(susan);
println!("John's partner after dropping Susan:");
println!("{:?}", RefCell::borrow(&john).partner_name());
}
John: RefCell { value: Person { name: "John", partner: Some((Weak)) } }
John's partner: Some("Susan")
Susan: RefCell { value: Person { name: "Susan", partner: Some((Weak)) } }
Susan's partner: Some("John")
John's partner after dropping Susan:
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', src/main.rs:16:51
Upvotes: 17
Reputation: 831
The Rust Programming Language has this to say about recursive types:
Rust needs to know at compile time how much space a type takes up. One kind of type whose size can’t be known at compile time is a recursive type where a value can have as part of itself another value of the same type. This nesting of values could theoretically continue infinitely, so Rust doesn’t know how much space a value of a recursive type needs. Boxes have a known size, however, so by inserting a box in a recursive type definition, we are allowed to have recursive types.
Basically, the struct would be of infinite size if you don't use boxing. E.g., Susan has a mother, father, and partner, each of which have a mother, father, and partner....etc. Boxing uses a pointer, which is a fixed size, and dynamic memory allocation.
Upvotes: 35
Reputation: 102006
Data inside struct
s and enum
s (and tuples) is stored directly inline inside the memory of the struct value. Given a struct like
struct Recursive {
x: u8,
y: Option<Recursive>
}
let's compute the size: size_of::<Recursive>()
. Clearly it has 1 byte from the x
field, and then the Option
has size 1 (for the discriminant) + size_of::<Recursive>()
(for the contained data), so, in summary, the size is the sum:
size_of::<Recursive>() == 2 + size_of::<Recursive>()
That is, the size would have to be infinite.
Another way to look at it is just expanding Recursive
repeatedly (as tuples, for clarity):
Recursive ==
(u8, Option<Recursive>) ==
(u8, Option<(u8, Option<Recursive>)>) ==
(u8, Option<(u8, Option<(u8, Option<Recursive>)>)>) ==
...
and all of this is stored inline in a single chunk of memory.
A Box<T>
is a pointer, i.e. it has a fixed size, so (u8, Option<Box<Recursive>>)
is 1 + 8 bytes. (One way to regard Box<T>
is that it's a normal T
with the guarantee that it has a fixed size.)
Upvotes: 144