Reputation: 4062
So I've read Why can't I store a value and a reference to that value in the same struct? and I understand why my naive approach to this was not working, but I'm still very unclear how to better handle my situation.
I have a program I wanted to structure like follows (details omitted because I can't make this compile anyway):
use std::sync::Mutex;
struct Species{
index : usize,
population : Mutex<usize>
}
struct Simulation<'a>{
species : Vec<Species>,
grid : Vec<&'a Species>
}
impl<'a> Simulation<'a>{
pub fn new() -> Self {...} //I don't think there's any way to implement this
pub fn run(&self) {...}
}
The idea is that I create a vector of Species
(which won't change for the lifetime of Simulation
, except in specific mutex-guarded fields) and then a grid representing which species live where, which will change freely. This implementation won't work, at least not any way I've been able to figure out. As I understand it, the issue is that pretty much however I make my new
method, the moment it returns, all of the references in grid
would becomine invalid as Simulation
and therefor Simulation.species
are moved to another location in the stack. Even if I could prove to the compiler that species
and its contents would continue to exist, they actually won't be in the same place. Right?
I've looked into various ways around this, such as making species
as an Arc
on the heap or using usize
s instead of references and implementing my own lookup function into the species vector, but these seem slower, messier or worse. What I'm starting to think is that I need to really re-structure my code to look something like this (details filled in with placeholders because now it actually runs):
use std::sync::Mutex;
struct Species{
index : usize,
population : Mutex<usize>
}
struct Simulation<'a>{
species : &'a Vec<Species>, //Now just holds a reference rather than data
grid : Vec<&'a Species>
}
impl<'a> Simulation<'a>{
pub fn new(species : &'a Vec <Species>) -> Self { //has to be given pre-created species
let grid = vec!(species.first().unwrap(); 10);
Self{species, grid}
}
pub fn run(&self) {
let mut population = self.grid[0].population.lock().unwrap();
println!("Population: {}", population);
*population += 1;
}
}
pub fn top_level(){
let species = vec![Species{index: 0, population : Mutex::new(0_)}];
let simulation = Simulation::new(&species);
simulation.run();
}
As far as I can tell this runs fine, and ticks off all the ideal boxes:
grid
uses simple references with minimal boilerplate for meBut, this feels very weird to me: the two-step initialization process of creating owned memory and then references can't be abstracted any way that I can see, which feels like I'm exposing an implementation detail to the calling function. top_level
has to also be responsible for establishing any other functions or (scoped) threads to run the simulation, call draw/gui functions, etc. If I need multiple levels of references, I believe I will need to add additional initialization steps to that level.
So, my question is just "Am I doing this right?". While I can't exactly prove this is wrong, I feel like I'm losing a lot of near-universal abstraction of the call structure. Is there really no way to return species
and simulation
as a pair at the end (with some one-off update to make all references point to the "forever home" of the data).
Phrasing my problem a second way: I do not like that I cannot have a function with a signature of ()-> Simulation
, when I can can have a pair of function calls that have that same effect. I want to be able to encapsulate the creation of this simulation. I feel like the fact that this approach cannot do so indicates I'm doing something wrong, and that there may be a more idiomatic approach I'm missing.
Upvotes: 2
Views: 509
Reputation: 2059
If you're owning a Vec
of objects, then want to also keep track of references to particular objects in that Vec
, a usize
index is almost always the simplest design. It might feel like extra boilerplate to you now, but it's a hell of a lot better than properly dealing with keeping pointers in check in a self-referential struct (as somebody who's made this mistake in C++ more than I should have, trust me). Rust's rules are saving you from some real headaches, just not ones that are obvious to you necessarily.
If you want to get fancy and feel like a raw usize
is too arbitrary, then I recommend you look at slotmap. For a simple SlotMap
, internally it's not much more than an array of values, iteration is fast and storage is efficient. But it gives you generational indices (slotmap calls these "keys") to the values: each value is embellished with a "generation" and each index also internally keeps hold of a its generation, therefore you can safely remove and replace items in the Vec
without your references suddenly pointing at a different object, it's really cool.
Upvotes: 2
Reputation: 43753
I've looked into various ways around this, such as making species as an Arc on the heap or using usizes instead of references and implementing my own lookup function into the species vector, but these seem slower, messier or worse.
Don't assume that, test it. I once had a self-referential (using ouroboros
) structure much like yours, with a vector of things and a vector of references to them. I tried rewriting it to use indices instead of references, and it was faster.
Rc
/Arc
is also an option worth trying out — note that there is only an extra cost to the reference counting when an Arc
is cloned or dropped. Arc<Species>
doesn't cost any more to dereference than &Species
, and you can always get an &Species
from an Arc<Species>
. So the reference counting only matters if and when you're changing which Species
is in an element of Grid
.
Upvotes: 5