Christoph
Christoph

Reputation: 27975

Why are recursive struct types illegal in Rust?

I'm trying out random things to deepen my understanding of Rust. I just ran into the following error with this code:

struct Person {
    mother: Option<Person>,
    father: Option<Person>,
    partner: Option<Person>,
}

pub fn main() {
    let susan = Person {
        mother: None,
        father: None,
        partner: None,
    };

    let john = Person {
        mother: None,
        father: None,
        partner: Some(susan),
    };
}

The error is:

error[E0072]: recursive type `Person` has infinite size
 --> src/main.rs:1:1
  |
1 | struct Person {
  | ^^^^^^^^^^^^^ recursive type has infinite size
2 |     mother: Option<Person>,
  |     ---------------------- recursive without indirection
3 |     father: Option<Person>,
  |     ---------------------- recursive without indirection
4 |     partner: Option<Person>,
  |     ----------------------- recursive without indirection
  |
  = help: insert indirection (e.g., a `Box`, `Rc`, or `&`) at some point to make `Person` representable

I understand that I can fix it if I put the Person in a Box, so this works:

struct Person {
    mother: Option<Box<Person>>,
    father: Option<Box<Person>>,
    partner: Option<Box<Person>>,
}

pub fn main() {
    let susan = Person {
        mother: None,
        father: None,
        partner: None,
    };

    let john = Person {
        mother: None,
        father: None,
        partner: Some(Box::new(susan)),
    };
}

I would like to understand the full story behind that. I know that boxing means that it will be stored on the heap rather than the stack but I don't get why this indirection is necessary.

Upvotes: 100

Views: 35249

Answers (3)

Finomnis
Finomnis

Reputation: 22396

Be aware that while a Box does in fact fix your problem on the surface, it makes it impossible to actually create an instance of it if you intend to have a bidirectional link between the partners:

struct Person {
    partner: Option<Box<Person>>,
}

pub fn main() {
    let susan = Person { partner: None };

    let mut john = Person {
        partner: Some(Box::new(susan)),
    };

    // This moves `susan` into `john`, meaning `susan` is now only
    // accessible through `john.partner`.
    let susan = john.partner.as_mut().unwrap();

    // It is literally impossible to set john as a partner of susan,
    // no matter what you try. (without using `unsafe`)
    susan.partner = Some(Box::new(john));
}
error[E0505]: cannot move out of `john` because it is borrowed
  --> src/main.rs:18:35
   |
14 |     let susan = john.partner.as_mut().unwrap();
   |                 --------------------- borrow of `john.partner` occurs here
...
18 |     susan.partner = Some(Box::new(john));
   |     -------------                 ^^^^ move out of `john` occurs here
   |     |
   |     borrow later used here

Box is only useful if you have a tree-like ownership chain, where someone owns the topmost item.

Your situation is not completely lost, however, just a little more complicated.

For once, you could use Rc instead of Box to do this. This is a little dangerous, though, because a circular Rc chain will leak and never be dropped unless you manually break the cycle at some point. Remember, Rust does not have a garbage collector.

One solution that I could see is Weak, which is a version of Rc that specifically does not keep the object it points to alive. It is made specifically for circular references like this. Note, however, that this makes the objects immutable and therefore we need to create interior mutability with RefCell.

use std::{
    cell::RefCell,
    rc::{Rc, Weak},
};

#[derive(Debug)]
struct Person {
    name: String,
    partner: Option<Weak<RefCell<Person>>>,
}

impl Person {
    fn partner_name(&self) -> Option<String> {
        self.partner
            .as_ref()
            .map(|partner| Weak::upgrade(partner).unwrap())
            .map(|partner| RefCell::borrow(&partner).name.clone())
    }
}

pub fn main() {
    let susan = Rc::new(RefCell::new(Person {
        name: "Susan".to_string(),
        partner: None,
    }));

    let john = Rc::new(RefCell::new(Person {
        name: "John".to_string(),
        partner: Some(Rc::downgrade(&susan)),
    }));

    // Now we can actually set them to be each other's partner:
    RefCell::borrow_mut(&susan).partner = Some(Rc::downgrade(&john));

    // Both `susan` and `john` are still accessible
    println!("John: {:?}", john);
    println!(
        "John's partner: {:?}\n",
        RefCell::borrow(&john).partner_name()
    );
    println!("Susan: {:?}", susan);
    println!(
        "Susan's partner: {:?}\n",
        RefCell::borrow(&susan).partner_name()
    );

    // Note that `susan` and `john` do not keep each other alive, as they
    // are only `Weak` references. Therefore dropping the `Rc` handle
    // Will cause the `Weak` handle to lose the connection.
    drop(susan);
    println!("John's partner after dropping Susan:");
    println!("{:?}", RefCell::borrow(&john).partner_name());
}
John: RefCell { value: Person { name: "John", partner: Some((Weak)) } }
John's partner: Some("Susan")

Susan: RefCell { value: Person { name: "Susan", partner: Some((Weak)) } }
Susan's partner: Some("John")

John's partner after dropping Susan:
thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', src/main.rs:16:51

Upvotes: 17

Steve Cobb
Steve Cobb

Reputation: 831

The Rust Programming Language has this to say about recursive types:

Rust needs to know at compile time how much space a type takes up. One kind of type whose size can’t be known at compile time is a recursive type where a value can have as part of itself another value of the same type. This nesting of values could theoretically continue infinitely, so Rust doesn’t know how much space a value of a recursive type needs. Boxes have a known size, however, so by inserting a box in a recursive type definition, we are allowed to have recursive types.

Basically, the struct would be of infinite size if you don't use boxing. E.g., Susan has a mother, father, and partner, each of which have a mother, father, and partner....etc. Boxing uses a pointer, which is a fixed size, and dynamic memory allocation.

Upvotes: 35

huon
huon

Reputation: 102006

Data inside structs and enums (and tuples) is stored directly inline inside the memory of the struct value. Given a struct like

struct Recursive {
    x: u8,
    y: Option<Recursive>
}

let's compute the size: size_of::<Recursive>(). Clearly it has 1 byte from the x field, and then the Option has size 1 (for the discriminant) + size_of::<Recursive>() (for the contained data), so, in summary, the size is the sum:

size_of::<Recursive>() == 2 + size_of::<Recursive>()

That is, the size would have to be infinite.

Another way to look at it is just expanding Recursive repeatedly (as tuples, for clarity):

Recursive ==
(u8, Option<Recursive>) ==
(u8, Option<(u8, Option<Recursive>)>) ==
(u8, Option<(u8, Option<(u8, Option<Recursive>)>)>) ==
...

and all of this is stored inline in a single chunk of memory.

A Box<T> is a pointer, i.e. it has a fixed size, so (u8, Option<Box<Recursive>>) is 1 + 8 bytes. (One way to regard Box<T> is that it's a normal T with the guarantee that it has a fixed size.)

Upvotes: 144

Related Questions