bright-star

Reputation: 6437

Why does this variable definition imply static lifetime?

I'm trying to execute a function on chunks of a vector and then send the result back using the message passing library.

However, I get a strange error about the lifetime of the vector that isn't even participating in the thread operations:

src/lib.rs:153:27: 154:25 error: borrowed value does not live long enough
src/lib.rs:153   let extended_segments = (segment_size..max_val)
src/lib.rs:154     .collect::<Vec<_>>()
note: reference must be valid for the static lifetime...
src/lib.rs:153:3: 155:27 note: but borrowed value is only valid for the statement at 153:2:
src/lib.rs:153   let extended_segments = (segment_size..max_val)
src/lib.rs:154     .collect::<Vec<_>>()
help: consider using a `let` binding to increase its lifetime

I tried moving the iterator around and adding lifetime annotations in different places, but I couldn't get the borrow checker to pass while keeping the types consistent.

The offending code is below, based on the concurrency chapter in the Rust book. (The complete code is on GitHub.)

use std::sync::mpsc;
use std::thread;

fn sieve_segment(a: &[usize], b: &[usize]) -> Vec<usize> {
    vec![]
}
fn eratosthenes_sieve(val: usize) -> Vec<usize> {
    vec![]
}

pub fn segmented_sieve_parallel(max_val: usize, mut segment_size: usize) -> Vec<usize> {
    if max_val <= ((2 as i64).pow(16) as usize) {
        // early return if the highest value is small enough (empirical)
        return eratosthenes_sieve(max_val);
    }

    if segment_size > ((max_val as f64).sqrt() as usize) {
        segment_size = (max_val as f64).sqrt() as usize;
        println!("Segment size is larger than √{}. Reducing to {} to keep resource use down.",
                 max_val,
                 segment_size);
    }

    let small_primes = eratosthenes_sieve((max_val as f64).sqrt() as usize);
    let mut big_primes = small_primes.clone();

    let (tx, rx): (mpsc::Sender<Vec<usize>>, mpsc::Receiver<Vec<usize>>) = mpsc::channel();

    let extended_segments = (segment_size..max_val)
        .collect::<Vec<_>>()
        .chunks(segment_size);
    for this_segment in extended_segments.clone() {
        let small_primes = small_primes.clone();
        let tx = tx.clone();

        thread::spawn(move || {
            let sieved_segment = sieve_segment(&small_primes, this_segment);
            tx.send(sieved_segment).unwrap();
        });
    }

    for _ in 1..extended_segments.count() {
        big_primes.extend(&rx.recv().unwrap());
    }

    big_primes
}

fn main() {}

How do I understand and avoid this error? I'm not sure how to make the lifetime of the thread closure static as in this question and still have the function be reusable (i.e., not main()). I'm not sure how to "consume all things that come into [the closure]" as mentioned in this question. And I'm not sure where to insert .map(|s| s.into()) to ensure that all references become moves, nor am I sure I want to.

Upvotes: 4

Views: 108

Answers (1)

Shepmaster

Reputation: 430635

When trying to reproduce a problem, I'd encourage you to create an MCVE by removing all irrelevant code. In this case, something like this seems to produce the same error:

fn segmented_sieve_parallel(max_val: usize, segment_size: usize) {
    let foo = (segment_size..max_val)
        .collect::<Vec<_>>()
        .chunks(segment_size);
}

fn main() {}

Let's break that down:

  1. Create an iterator between numbers.
  2. Collect all of them into a Vec<usize>.
  3. Return an iterator that contains references to the vector.
  4. Since the vector isn't bound to any variable, it's dropped at the end of the statement. This would leave the iterator pointing to an invalid region of memory, so that's disallowed.

Check out the definition of slice::chunks:

fn chunks(&self, size: usize) -> Chunks<T>

pub struct Chunks<'a, T> where T: 'a {
    // some fields omitted
}

The lifetime marker 'a lets you know that the iterator contains a reference to something. Lifetime elision hides the 'a in the function signature; written out in full, it looks like this:

fn chunks<'a>(&'a self, size: usize) -> Chunks<'a, T>
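The same elision rule can be seen on a small type of your own. The sketch below is only an illustration: `FirstTwo`, `first_two`, and `first_two_explicit` are hypothetical names, not part of the standard library. Like `Chunks`, the struct holds a borrow, so the value it borrows from has to outlive it.

```rust
// Hypothetical type for illustration: like `Chunks`, it stores a borrow.
struct FirstTwo<'a> {
    items: &'a [usize],
}

// Elided form: the compiler ties the result's lifetime to `v`.
fn first_two(v: &[usize]) -> FirstTwo<'_> {
    FirstTwo { items: &v[..v.len().min(2)] }
}

// Expanded form, equivalent to the function above.
fn first_two_explicit<'a>(v: &'a [usize]) -> FirstTwo<'a> {
    FirstTwo { items: &v[..v.len().min(2)] }
}

fn main() {
    // `owner` is bound to a variable, so it outlives `view`.
    let owner = vec![10, 20, 30];
    let view = first_two(&owner);
    assert_eq!(view.items, &[10, 20][..]);
    assert_eq!(first_two_explicit(&owner).items, view.items);

    // Borrowing from an unnamed temporary and using the result afterwards
    // is rejected, for the same reason as in the question:
    // let dangling = first_two(&vec![1, 2, 3]);
    // println!("{:?}", dangling.items);
}
```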

Check out this line of the error message:

help: consider using a let binding to increase its lifetime

You can follow that advice like so:

fn segmented_sieve_parallel(max_val: usize, segment_size: usize) {
    let foo = (segment_size..max_val)
        .collect::<Vec<_>>();
    let bar = foo.chunks(segment_size);
}

fn main() {}

Although I'd write it as

fn segmented_sieve_parallel(max_val: usize, segment_size: usize) {
    let foo: Vec<_> = (segment_size..max_val).collect();
    let bar = foo.chunks(segment_size);
}

fn main() {}

Putting this back into your original code won't make it compile, but the remaining error will be much easier to understand. You are passing a reference (this_segment, a slice borrowed from the vector) into the closure given to thread::spawn, and the spawned thread may outlive the function that owns the vector. For that reason, everything captured by a closure passed to thread::spawn must have the 'static lifetime. There are tons of questions that detail why that must be prevented and a litany of solutions, including scoped threads and cloning the vector.
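As a rough sketch of the scoped-thread route, the version below uses `std::thread::scope`, which is in the standard library as of Rust 1.63 (older toolchains would need a crate that offers scoped threads instead). `process_segment` and `process_in_scoped_threads` are hypothetical stand-ins for the question's functions, and the filtering is a placeholder rather than a sieve. Because the scope joins every thread before it returns, the threads may borrow the chunks directly.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for `sieve_segment`: keeps only the even numbers
// so the sketch has something observable to return.
fn process_segment(segment: &[usize]) -> Vec<usize> {
    segment.iter().copied().filter(|n| n % 2 == 0).collect()
}

// Assumes `segment_size` is non-zero; `chunks(0)` panics.
fn process_in_scoped_threads(max_val: usize, segment_size: usize) -> Vec<usize> {
    // The vector is bound to a variable, so it lives for the whole function.
    let numbers: Vec<usize> = (segment_size..max_val).collect();
    let (tx, rx) = mpsc::channel();

    // All threads spawned on `scope` are joined before `thread::scope`
    // returns, so borrowing `numbers` inside them is allowed.
    thread::scope(|scope| {
        for segment in numbers.chunks(segment_size) {
            let tx = tx.clone();
            scope.spawn(move || {
                tx.send(process_segment(segment)).unwrap();
            });
        }
    });

    // Drop the last sender so the receiving iterator terminates.
    drop(tx);

    // Results arrive in an unspecified order, so sort them afterwards.
    let mut results: Vec<usize> = rx.iter().flatten().collect();
    results.sort_unstable();
    results
}

fn main() {
    println!("{:?}", process_in_scoped_threads(20, 4));
}
```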

Cloning the vector is the easiest, but potentially inefficient:

for this_segment in extended_segments.clone() {
    let this_segment = this_segment.to_vec();
    // ...
}
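Fleshed out a little, the cloning approach might look like the sketch below. It is a minimal illustration under assumptions, not your sieve: `double_segment` and `process_with_owned_chunks` are hypothetical names, and the doubling stands in for the real per-segment work. Each thread receives its own `Vec`, so no borrow crosses the `thread::spawn` boundary.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for the per-segment work done by `sieve_segment`.
fn double_segment(segment: &[usize]) -> Vec<usize> {
    segment.iter().map(|n| n * 2).collect()
}

// Assumes `segment_size` is non-zero; `chunks(0)` panics.
fn process_with_owned_chunks(max_val: usize, segment_size: usize) -> Vec<usize> {
    let numbers: Vec<usize> = (segment_size..max_val).collect();
    let (tx, rx) = mpsc::channel();

    let mut spawned = 0;
    for segment in numbers.chunks(segment_size) {
        // Owned copy of the chunk: the closure captures no references.
        let segment = segment.to_vec();
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(double_segment(&segment)).unwrap();
        });
        spawned += 1;
    }

    // Receive exactly one message per spawned thread.
    let mut results = Vec::new();
    for _ in 0..spawned {
        results.extend(rx.recv().unwrap());
    }

    // Results arrive in an unspecified order, so sort them afterwards.
    results.sort_unstable();
    results
}

fn main() {
    println!("{:?}", process_with_owned_chunks(10, 3));
}
```

Note that the receive loop here runs `0..spawned`; a range starting at 1, as in `1..extended_segments.count()`, iterates one time fewer than the number of chunks.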

Upvotes: 4
