James Parker
James Parker

Reputation: 477

Dereferencing a vector of references in Rust

I have the following DFS implementation in Rust that has a type inconsistency error right now.

fn dfs<T: Hash + Eq>(
    initial_state: T,
    is_final: &dyn Fn(&T) -> bool,
    next_states: &dyn Fn(&T) -> Vec<T>,
) -> Vec<T> {

    let mut final_states = Vec::new();

    {
        let mut stack = Vec::new();
        let mut seen = HashSet::new();
        stack.push(&initial_state);
        seen.insert(&initial_state);

        while let Some(node) = stack.pop() {
            if is_final(node) {
                final_states.push(node);
            }

            for ns in next_states(node) {
                if !seen.contains(&ns) {
                    seen.insert(&ns);
                    stack.push(&ns);
                }
            }
        }
    }
    final_states
}

The compiler currently complains because final_states is of type Vec<&T> and not Vec<T>. If I update the function signature to return Vec<T>, the compiler complains with an error returns a value referencing data owned by the current function and points to stack.push(&ns); and other lines with stack and seen.

Is there a way to update final_states of type Vec<&T> to Vec<T>?

If not, what is the method to resolve the problem?

Edit: If Copy trait is added to T (which I prefer not to do unless it is really necessary), there is another compiler issue with for ns next_states(node) that says "borrowed value does not live long enough". ns does not live outside of the loop.

Link to the playground

Upvotes: 2

Views: 1787

Answers (3)

kmdreko
kmdreko

Reputation: 59817

This is an extension on your self-answer.

  • You should use Clone instead of Copy because it is less restrictive. Many types are Clone but far fewer are Copy. This also exposes an additional copy being done via final_states.push(node) that isn't obvious without .clone().

  • You make a few unnecessary clones. The only .clone() calls that should be necessary should be when putting elements in the seen set, as otherwise the state can be moved into the stack. You can move states out of the next_states results by iterating over nss instead of &nss.

  • Moving the is_final check after the next_states step allows node to be moved to final_states instead of cloned. Since it is no longer used and has already been moved out of stack.

The updated code:

fn dfs<T: Clone + Eq + Hash>(
    initial_state: T,
    is_final: &dyn Fn(&T) -> bool,
    next_states: &dyn Fn(&T) -> Vec<T>,
) -> Vec<T> {
    let mut final_states = Vec::new();
    let mut stack = Vec::new();
    let mut seen = HashSet::new();
    seen.insert(initial_state.clone());
    stack.push(initial_state);

    while let Some(node) = stack.pop() {
        let nss = next_states(&node);
        for ns in nss {
            if !seen.contains(&ns) {
                seen.insert(ns.clone());
                stack.push(ns);
            }
        }
        
        if is_final(&node) {
            final_states.push(node);
        }
    }

    final_states
}

If cloning is unacceptable, you can have seen and stack/final_states share ownership by using Rcs. Though its a bit more work to pull out the final states into a Vec<T> at the end.

use std::rc::Rc;
use std::fmt::Debug;

fn dfs<T: Debug + Eq + Hash>(
    initial_state: T,
    is_final: &dyn Fn(&T) -> bool,
    next_states: &dyn Fn(&T) -> Vec<T>,
) -> Vec<T> {
    let mut final_states = Vec::new();
    let mut stack = Vec::new();
    let mut seen = HashSet::new();
    
    let initial_state = Rc::new(initial_state);
    seen.insert(Rc::clone(&initial_state));
    stack.push(initial_state);

    while let Some(node) = stack.pop() {
        let nss = next_states(&node);
        for ns in nss {
            if !seen.contains(&ns) {
                let ns = Rc::new(ns);
                seen.insert(Rc::clone(&ns));
                stack.push(ns);
            }
        }
        
        if is_final(&node) {
            final_states.push(node);
        }
    }
    
    // Ensure stack and seen are dropped early so that
    // final_states has unique ownership.
    std::mem::drop(stack);
    std::mem::drop(seen);
    
    final_states
        .into_iter()
        .map(|state| Rc::try_unwrap(state).unwrap())
        .collect()
}

Upvotes: 3

James Parker
James Parker

Reputation: 477

There is an issue with the lifetimes of variables in the code. Specifically, states generated from next_states. There is a couple of issues associated with it.

  1. Once the reference of the variable is moved, you cannot move the variable itself. For the case above, the variable ns is moved as a reference in seen.insert(&ns) and hence, it's not possible to return the variable outside of the function as Vec<T>.

  2. The memory of the variable has a scope. The memory allocated within the loop at next_states(node) is deallocated when it goes outside of the scope. Hence, the states cannot be accessed outside of the loop.

One solution to the problem is to clone the memories when it needs to be moved so that it avoids memory ownership problem altogether. However, this is not the most memory-efficient way to solve the problem and T needs to implement additional trait Copy.

Here is the updated code.

fn dfs<T: Copy + Eq + Hash>(
    initial_state: T,
    is_final: &dyn Fn(&T) -> bool,
    next_states: &dyn Fn(&T) -> Vec<T>,
) -> Vec<T> {
    let mut final_states: Vec<T> = Vec::new();
    let mut stack = Vec::new();
    let mut seen = HashSet::new();
    stack.push(initial_state.clone());
    seen.insert(initial_state);

    while let Some(node) = stack.pop() {
        if is_final(&node) {
            final_states.push(node);
        }

        let nss = next_states(&node);
        for ns in &nss {
            if !seen.contains(&ns) {
                seen.insert(ns.clone());
                stack.push(ns.clone());
            }
        }
    }
    final_states
}

Upvotes: 0

EvilTak
EvilTak

Reputation: 7579

Your graph structure is interesting, because it seems like you are generating the neighbors for each node on demand via a call to next_states. This is why the compiler complains when you change the return type to return a vector of references, as you are returning references to something (the neighbors of every visited node) that was generated during the execution of this function and will be dropped after the function finishes executing.

The easiest solution in this case is to make T Clone or Copy. This way, you can push a clone/copy of node onto final_states, and the compiler should stop complaining. Given that T implements both Hash and Eq, I would guess that it is a good candidate to be Cloneable (or Copyable) as well, as any reasonable implementation of Hash and Eq would ensure that x.clone() == x returns true.

Upvotes: 2

Related Questions