Reputation: 63
I have a large nested data structure and would like to pluck out a few parts to pass around for processing. Ultimately I want to send sections to multiple threads to update but I'd like to get my feet wet understanding the simple example I illustrate below. In C I would just assemble an array of the relevant pointers. It seems doable in Rust as the interior vectors will never need multiple mutable references. Here's the sample code.
fn main() {
let mut data = Data::new(vec![2, 3, 4]);
// this works
let slice = data.get_mut_slice(1);
slice[2] = 5.0;
println!("{:?}", data);
// what I would like to do
// let slices = data.get_mut_slices(vec![0, 1]);
// slices[0][0] = 2.0;
// slices[1][0] = 3.0;
// println!("{:?}", data);
}
#[derive(Debug)]
struct Data {
data: Vec<Vec<f64>>,
}
impl Data {
fn new(lengths: Vec<usize>) -> Data {
Data {
data: lengths.iter().map(|n| vec![0_f64; *n]).collect(),
}
}
fn get_mut_slice(&mut self, index: usize) -> &mut [f64] {
&mut self.data[index][..]
}
// doesnt work
// fn get_mut_slices(&mut self, indexes: Vec<usize>) -> Vec<&mut [f64]> {
// indexes.iter().map(|i| self.get_mut_slice(*i)).collect()
// }
}
Upvotes: 2
Views: 6072
Reputation: 11
You could use split_at_mut
, but it doesn't work as well if you need more than two indices or if the order is unknown.
The other options is get_many_mut
on nightly, but just be aware it has to do a sort of the slice of indices you provide. Which may not be best option if you know the amount of indices, or the ordering, or the values at compile time.
For stable, this crate solves this issue and has some methods and macros optimized for a few different scenarios: https://github.com/mcmah309/indices
fn get_mut_slices(&mut self, indexes: Vec<usize>) -> Vec<&mut [f64]> {
indices::indices_slice(self.data, &mut indexes)
}
Upvotes: 0
Reputation: 15115
This is possible using safe Rust as long as you're very careful. The trick is to leverage the unsafe Rust code in the standard library behind the safe .iter_mut()
and .nth()
methods on Vec
. Here's a working example with comments explaining the code in context:
fn main() {
let mut data = Data::new(vec![2, 3, 4]);
// this works
let slice = data.get_mut_slice(1);
slice[2] = 5.0;
println!("{:?}", data);
// and now this works too!
let mut slices = data.get_mut_slices(vec![0, 1]);
slices[0][0] = 2.0;
slices[1][0] = 3.0;
println!("{:?}", data);
}
#[derive(Debug)]
struct Data {
data: Vec<Vec<f64>>,
}
impl Data {
fn new(lengths: Vec<usize>) -> Data {
Data {
data: lengths.iter().map(|n| vec![0_f64; *n]).collect(),
}
}
fn get_mut_slice(&mut self, index: usize) -> &mut [f64] {
&mut self.data[index][..]
}
// now works!
fn get_mut_slices(&mut self, mut indexes: Vec<usize>) -> Vec<&mut [f64]> {
// sort indexes for easier processing
indexes.sort();
let index_len = indexes.len();
// early return for edge case
if index_len == 0 {
return Vec::new();
}
// check that the largest index is in bounds
let max_index = indexes[index_len - 1];
if max_index > self.data.len() {
panic!("{} index is out of bounds of data", max_index);
}
// check that we have no overlapping indexes
indexes.dedup();
let uniq_index_len = indexes.len();
if index_len != uniq_index_len {
panic!("cannot return aliased mut refs to overlapping indexes");
}
// leverage the unsafe code that's written in the standard library
// to safely get multiple unique disjoint mutable references
// out of the Vec
let mut mut_slices_iter = self.data.iter_mut();
let mut mut_slices = Vec::with_capacity(index_len);
let mut last_index = 0;
for curr_index in indexes {
mut_slices.push(
mut_slices_iter
.nth(curr_index - last_index)
.unwrap()
.as_mut_slice(),
);
last_index = curr_index;
}
// return results
mut_slices
}
}
What I believe I learned is that the Rust compiler demands an iterator in this situation because that is the only way it can know that each mut slice comes from a different vector.
The compiler doesn't know that actually. All it knows is that the iterator returns mut references. The underlying implementation uses unsafe Rust but the method iter_mut()
itself is safe because the implementation guarantees to only emit each mut ref once and that all of the mut refs are unique.
Would the compiler complain if another
mut_slices_iter
was created in the for loop (which could grab the same data twice)?
Yes. Calling iter_mut()
on Vec
mutably borrows it and overlapping mutable borrows of the same data are against Rust's ownership rules, so you can't call iter_mut()
twice in the same scope (unless the iterator returned by the first call is dropped before the second call).
Also am I right that the
.nth
method will callnext()
n times so it is ultimately theta(n) on the first axis?
Not quite. That's the default implementation for nth
BUT the iterator returned by calling iter_mut()
on Vec
uses its own custom implementation and it seems to skip past items in the iterator without calling next()
so it should be as fast as if you were just regularly indexing into the Vec
, i.e. getting 3 randomly indexed items using .nth()
would be as fast on an iterator of 10000 items as it would be on an iterator of 10 items, although this is only specifically true for iterators created from collections which support fast random access like Vec
s.
Upvotes: 7
Reputation: 42708
If you want unique indexes, which actually makes more sense since you can't/shoudn't have two mutable references to the same element. You can use a HashSet
instead of a Vec
and use some iterator combinations:
fn get_mut_slices(&mut self, indexes: HashSet<usize>) -> Vec<&mut [f64]> {
self.data
.iter_mut()
.enumerate()
.filter(|(i, _)| indexes.contains(i))
.map(|(_, e)| e.as_mut_slice())
.collect()
}
You can still use Vec
for this option, but it would be much more inefficient when using contains
:
fn get_mut_slices(&mut self, indexes: Vec<usize>) -> Vec<&mut [f64]> {
self.data
.iter_mut()
.enumerate()
.filter(|(i, _)| indexes.contains(i))
.map(|(_, e)| e.as_mut_slice())
.collect()
}
Upvotes: 1