Aleksandar Dimitrov

Reputation: 9487

Is there a way to extend the lifetime of locally defined variables?

Use case: I've got a paginated GraphQL API where many different entities return an opaque cursor and a boolean hasNext. I would like to expose these entities as a TryStream so that computations can happen while the remaining pages are still being fetched.

I've defined a trait to abstract that. get_data fetches a single page:

trait PaginatedEntityQuery {
    type ResponseData;
    fn get_cursor(data: &Self::ResponseData) -> Option<String>;
    fn has_next_page(data: &Self::ResponseData) -> bool;
    fn get_data<'a>(
        &self, // access to query variables
        backend: &'a Backend,  // access to a backend to get data from
        cursor: Option<String>,  // the cursor in the pagination
    ) -> Pin<Box<dyn Future<Output = Result<Self::ResponseData>> + 'a>>;
}

I'm using try_unfold in my Backend impl:

fn get_paginated_entities<'a, T>(
  &'a self,
  query: &'a T,
) -> impl TryStream<Ok = T::ResponseData, Error = anyhow::Error> + 'a
where
  T: PaginatedEntityQuery,
{
  try_unfold(StreamState::Start, async move |state| {
    // stream state handling code that defines `cursor`
    let data = query.get_data(self, cursor).await?;
    // more data + state computations + return result of `get_data`.
  })
}

Now I'd like to define something like

struct SomeEntityQuery { filter: String }
impl PaginatedEntityQuery for SomeEntityQuery { /* … */ }

Finally, I can define a function that uses get_paginated_entities to do the heavy lifting:

fn get_some_entities<'a>(filter: String, /* … */) -> impl TryStream<Ok = Vec<SomeOtherType>, Error = anyhow::Error> + 'a {
  backend.get_paginated_entities(&SomeEntityQuery { filter }).map_ok /* … */
}

Of course, this doesn't work. I've defined an instance of SomeEntityQuery in get_some_entities but I'm returning owned values. Rust can't know that no parts of SomeEntityQuery actually show up in the return value.

I could make query: &'a T in get_paginated_entities owned instead of a reference, but then I'd move it out in the async move that I give to try_unfold, and since that closure is an FnMut it can't give away ownership on every call anyway. The query has to be shared. Also, removing the reference from T gets Rust to complain about the type parameter potentially not living long enough.

Is there a way to make sure that values for the query itself aren't required to live as long as the values the query returns? Or possibly to extend the lifetimes of the query? I'm fine with copying the query a lot, if need be. It's lightweight, and performance is network bound.

Upvotes: 1

Views: 164

Answers (1)

dominicm00

Reputation: 1600

Let's first reduce this to a minimal example that we can actually feed to the compiler and that reproduces the error:

trait Query {
    fn get_data<'backend>(&self, backend: &'backend Backend) -> Result<'backend>;
}

struct Result<'data> {
    data: &'data str,
}

struct Backend {
    database_data: String,
}

struct SomeQuery {
    length_filter: usize,
}

impl Query for SomeQuery {
    fn get_data<'backend>(&self, backend: &'backend Backend) -> Result<'backend> {
        Result {
            // we use SomeQuery here, but it's not in the data
            data: &backend.database_data[..self.length_filter],
        }
    }
}

impl Backend {
    fn get_paginated_entities<'backend>(
        &'backend self,
        query: &'backend SomeQuery,
    ) -> Result<'backend> {
        query.get_data(self)
    }
}

fn get_some_entities<'backend>(
    length_filter: usize,
    backend: &'backend Backend,
) -> Result<'backend> {
    backend.get_paginated_entities(&SomeQuery { length_filter })
}

Hopefully now it's easier to find the problem; Rust thinks the data has the same lifetime as the query because we told it they do!

fn get_paginated_entities<'backend>(
    // the returned data is borrowed from the backend, so we
    // tie Result to the backend's lifetime: the Result can't
    // outlive the backend it points into
    &'backend self,

    // but we told Rust here that query also has a lifetime
    // of 'backend, so the query also has to live as long as
    // the return value does
    query: &'backend SomeQuery,
) -> Result<'backend> {
    query.get_data(self)
}

We can use multiple lifetimes here to say that the lifetime of the return value doesn't have anything to do with the query lifetime (we don't actually need it in this example, but you would in your code).

fn get_paginated_entities<'backend, 'query>(
    &'backend self,
    // now the return data isn't linked to query's lifetime
    query: &'query SomeQuery,
) -> Result<'backend> {
    query.get_data(self)
}

This works fine because get_data also specifies that query's lifetime isn't linked to Result.

fn get_data<'backend>(&self, backend: &'backend Backend) -> Result<'backend>;
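
Putting the pieces together, here is a sketch of the whole minimal example with the two lifetimes separated. The result type is renamed to QueryResult here (my choice, not part of the snippets above) so it doesn't shadow std's Result, and first_five is a hypothetical helper added only to exercise the code.

```rust
struct QueryResult<'data> {
    data: &'data str,
}

struct Backend {
    database_data: String,
}

trait Query {
    // Only `backend` is tied to the returned data; `&self` gets its own
    // (elided) lifetime, so the query may be dropped before the result.
    fn get_data<'backend>(&self, backend: &'backend Backend) -> QueryResult<'backend>;
}

struct SomeQuery {
    length_filter: usize,
}

impl Query for SomeQuery {
    fn get_data<'backend>(&self, backend: &'backend Backend) -> QueryResult<'backend> {
        QueryResult {
            // the query is used here, but nothing in the result borrows from it
            data: &backend.database_data[..self.length_filter],
        }
    }
}

impl Backend {
    fn get_paginated_entities<'backend, 'query>(
        &'backend self,
        query: &'query SomeQuery,
    ) -> QueryResult<'backend> {
        query.get_data(self)
    }
}

fn get_some_entities<'backend>(
    length_filter: usize,
    backend: &'backend Backend,
) -> QueryResult<'backend> {
    // The temporary SomeQuery is dropped at the end of this expression.
    // That is fine now, because the result only borrows from `backend`.
    backend.get_paginated_entities(&SomeQuery { length_filter })
}

// Hypothetical helper, only here to show the functions being called.
fn first_five() -> String {
    let backend = Backend {
        database_data: String::from("hello world"),
    };
    let result = get_some_entities(5, &backend);
    result.data.to_string()
}
```

With query: &'backend SomeQuery restored in get_paginated_entities, get_some_entities stops compiling again, because the temporary query would then have to live as long as the returned value.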

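EDIT: in your real code the problem also has to do with async. The closure you pass to try_unfold, and the futures it produces, capture query, and the stream is only polled after get_some_entities has returned, so a plain reference to a query created inside that function can't work. The stream has to own the query instead. We can fix this by using Rc instead of a normal reference and cloning it for each page. An Rc owns its value and keeps it alive until the last clone is dropped, so it isn't tied to any stack frame (an Rc<T> is 'static whenever T is). If the stream has to be Send, use Arc instead.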
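
As a rough sketch of the Rc idea, here is a synchronous analog that uses std::iter::from_fn in place of futures' try_unfold, so it compiles without any external crates. Everything in it is made up for illustration: the Page type, the min_value filter, the usize cursor, and the collect_pages helper are assumptions, not your API. The ownership pattern is the part that should carry over to the async version.

```rust
use std::rc::Rc;

// Hypothetical backend: each inner Vec stands for one page on the server.
struct Backend {
    pages: Vec<Vec<u32>>,
}

// Hypothetical page: the items plus the cursor of the next page, if any.
struct Page {
    items: Vec<u32>,
    next_cursor: Option<usize>,
}

struct SomeQuery {
    min_value: u32,
}

impl SomeQuery {
    // Fetches a single page; a missing cursor means "start at the first page".
    fn get_data(&self, backend: &Backend, cursor: Option<usize>) -> Page {
        let index = cursor.unwrap_or(0);
        let items: Vec<u32> = backend
            .pages
            .get(index)
            .map(|page| page.iter().copied().filter(|v| *v >= self.min_value).collect())
            .unwrap_or_default();
        let next = index + 1;
        Page {
            items,
            next_cursor: if next < backend.pages.len() { Some(next) } else { None },
        }
    }
}

enum StreamState {
    Start,
    Next(usize),
    Done,
}

impl Backend {
    // The query arrives as an owned Rc, so the returned iterator borrows
    // only the backend ('a) and nothing from the caller's stack frame.
    fn get_paginated_entities<'a>(
        &'a self,
        query: Rc<SomeQuery>,
    ) -> impl Iterator<Item = Vec<u32>> + 'a {
        let mut state = StreamState::Start;
        std::iter::from_fn(move || {
            let cursor = match state {
                StreamState::Start => None,
                StreamState::Next(c) => Some(c),
                StreamState::Done => return None,
            };
            // In the async version this clone is what you would move into the
            // future for each page, leaving the closure's own Rc in place.
            let query = Rc::clone(&query);
            let page = query.get_data(self, cursor);
            state = match page.next_cursor {
                Some(c) => StreamState::Next(c),
                None => StreamState::Done,
            };
            Some(page.items)
        })
    }
}

fn get_some_entities<'a>(
    backend: &'a Backend,
    min_value: u32,
) -> impl Iterator<Item = Vec<u32>> + 'a {
    // The query is created locally but handed over by value inside the Rc.
    backend.get_paginated_entities(Rc::new(SomeQuery { min_value }))
}

// Hypothetical helper, only here to show the functions being called.
fn collect_pages() -> Vec<Vec<u32>> {
    let backend = Backend {
        pages: vec![vec![1, 5, 9], vec![2, 8], vec![7]],
    };
    let pages: Vec<Vec<u32>> = get_some_entities(&backend, 5).collect();
    pages
}
```

Since you said the query is lightweight, deriving Clone on it and cloning the query itself for each page should work just as well as the Rc.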

Upvotes: 3
