Reputation: 663
I have an application where I need to:
I'd like to do this with only 1 hash/lookup. Is that possible? Is it possible to do it with 2?
I'm computing Fibonacci to show the purpose. The function that would be nice to optimize from 3 lookups to 1 lookup is the index()
function:
// Written by beefucurry
use std::collections::HashMap;
use std::cell::UnsafeCell;
fn main() {
let fib = SaveCall::<u32, u32>::new(|k, fib_| {
if k < 2 {
return k;
} else {
return fib_[k - 1] + fib_[k - 2];
}
});
for k in (0..30).rev() {
println!("fib[{}] = {}", k, fib[k]);
}
}
pub struct SaveCall<'a, In: std::cmp::Eq + std::hash::Hash + Copy, Out> {
cache: UnsafeCell<HashMap<In, Option<Out>>>,
f: Box<dyn Fn(In, &SaveCall<'a, In, Out>) -> Out + 'a>,
}
impl<'a, In: std::cmp::Eq + std::hash::Hash + Copy, Out> SaveCall<'a, In, Out> {
pub fn new(f: impl Fn(In, &Self) -> Out + 'a) -> Self {
Self {
cache: UnsafeCell::new(HashMap::new()),
f: Box::new(f),
}
}
}
impl<'a, In: std::cmp::Eq + std::hash::Hash + Copy, Out> std::ops::Index<In>
for SaveCall<'a, In, Out>
{
type Output = Out;
fn index<'b>(&'b self, input: In) -> &'b Self::Output {
unsafe {
//println!("~~Looking up {:?}", input);
let cache = self.cache.get();
let saved = (*cache).get(&input);
if saved.is_some() {
//println!("~~Found {:?}", input);
let z = saved.unwrap().as_ref();
if z.is_some() {
return z.unwrap();
} else {
panic!("Attempted to compute a value based on itself");
}
}
(*cache).insert(input, None);
//println!("~~Computing {:?}", input);
let output: Out = (self.f)(input, self);
//println!("~~Writing to cache {:?}", input);
(*cache).insert(input, Some(output));
let output_ref = (*cache).get(&input).unwrap().as_ref().unwrap();
output_ref
}
}
}
Upvotes: 2
Views: 365
Reputation: 161627
I'd like to do this with only 1 hash/lookup. Is that possible?
It is not possible with a single hash lookup because Rust has no way to know whether the calculation itself might try to change the structure of your SaveCall
object. If your callback didn't accept self
as a parameter, then it would be possible, but not with the API as you want it.
Is it possible to do it with 2?
It is possible with 2 using HashMap::entry
, which lets you search for a key in the map, and modify the location and inspect whether or not it had a value, without having to recompute the hash.
You can use this to compute the hash once when you initially query the table and either return or insert None
, and then you can insert normally when you have the final result.
This takes us to your code itself. It is not possible to use the std::ops::Index
trait to do what you want. Your example code has tried to get around this with UnsafeCell
, but your unsafe code violates the requirements that unsafe code must adhere to.
The short version is that the Index
trait returns a reference. You specifically want to mutate the map by inserting entries as you access them, that means that the HashMap must grow and reallocate space for its values as it does so. If you have to re-allocate, it means that there is no way for the references returned by the index
to remain valid, since they were referencing to the memory that was re-allocated.
Where does that leave you?
Index
trait, and instead implement a method on your SavedCall
type that does not return a reference.Out
directly, then Out
needs to explicitly be a Copy
type. If you want to support non-Copy types, then you'd potentially need to return an Rc
or an Arc
.For example
impl<'a, In: std::cmp::Eq + std::hash::Hash + Copy, Out> SaveCall<'a, In, Out> {
fn get(&self, input: In) -> Rc<Out> {
or
impl<'a, In: std::cmp::Eq + std::hash::Hash + Copy, Out: Copy> SaveCall<'a, In, Out> {
fn get(&self, input: In) -> Out {
Here's a more complete version built on your example code on the Rust playground. This uses Rc to best demonstrate the example, but you can trim it down if Out: Copy
is fine.
Upvotes: 4