Kity Cartman

Reputation: 896

How do I return a Rust iterator from a Python module function using pyo3?

It's not that I can't return any Rust iterator from a Python module function using pyo3. The problem arises when a lifetime doesn't live long enough!

Allow me to explain.

First attempt:

#[pyclass]
struct ItemIterator {
    iter: Box<dyn Iterator<Item = u64> + Send>,
}

#[pymethods]
impl ItemIterator {
    fn __iter__(slf: PyRef<'_, Self>) -> PyRef<'_, Self> {
        slf
    }
    fn __next__(mut slf: PyRefMut<'_, Self>) -> Option<u64> {
        slf.iter.next()
    }
}

#[pyfunction]
fn get_numbers() -> ItemIterator {
    let i = vec![1u64, 2, 3, 4, 5].into_iter();
    ItemIterator { iter: Box::new(i) }
}

In the contrived example above, I have written a Python iterator wrapper for a Rust iterator, as per the pyo3 guide, and it works seamlessly.

Second attempt: The problem is when lifetimes are involved.

Say I now have a Warehouse struct that I want to make available as a Python class, along with its associated functions.

struct Warehouse {
    items: Vec<u64>,
}

impl Warehouse {
    fn new() -> Warehouse {
        Warehouse {
            items: vec![1u64, 2, 3, 4, 5],
        }
    }

    fn get_items(&self) -> Box<dyn Iterator<Item = u64> + '_> {
        Box::new(self.items.iter().map(|f| *f))
    }
}

Implementing them as a Python class and methods:

#[pyclass]
struct ItemIterator {
    iter: Box<dyn Iterator<Item = u64> + Send>,
}

#[pymethods]
impl ItemIterator {
    fn __iter__(slf: PyRef<'_, Self>) -> PyRef<'_, Self> {
        slf
    }
    fn __next__(mut slf: PyRefMut<'_, Self>) -> Option<u64> {
        slf.iter.next()
    }
}

#[pyclass]
struct Warehouse {
    items: Vec<u64>,
}

#[pymethods]
impl Warehouse {
    #[new]
    fn new() -> Warehouse {
        Warehouse {
            items: vec![1u64, 2, 3, 4, 5],
        }
    }

    fn get_items(&self) -> ItemIterator {
        ItemIterator {
            iter: Box::new(self.items.iter().map(|f| *f)),
        }
    }
}

This throws a compiler error in the get_items function:

error: lifetime may not live long enough
  --> src/lib.rs:54:19
   |
52 |     fn get_items(&self) -> ItemIterator {
   |                  - let's call the lifetime of this reference `'1`
53 |         ItemIterator {
54 |             iter: Box::new(self.items.iter().map(|f| *f)),
   |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cast requires that `'1` must outlive `'static`

error: could not compile `pyo3-example` due to previous error

I am not really sure how to fix this. Can someone explain what is going on here? How does this compare to my first attempt at implementing iterators, and how do I fix it?

Upvotes: 0

Views: 747

Answers (2)

Peter Hall

Reputation: 58725

If we remove the Python stuff:

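struct ItemIterator {
    iter: Box<dyn Iterator<Item = u64> + Send>,
}

impl ItemIterator {
    fn __iter__(&self) -> &'_ ItemIterator {
        self
    }
    fn __next__(&mut self) -> Option<u64> {
        self.iter.next()
    }
}

struct Warehouse {
    items: Vec<u64>,
}

impl Warehouse {
    fn get_items(&self) -> ItemIterator {
        ItemIterator {
            iter: Box::new(self.items.iter().map(|f| *f)),
        }
    }
}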

We see the same error:

error: lifetime may not live long enough
  --> src/lib.rs:21:19
   |
19 |     fn get_items(&self) -> ItemIterator {
   |                  - let's call the lifetime of this reference `'1`
20 |         ItemIterator {
21 |             iter: Box::new(self.items.iter().map(|f| *f)),
   |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cast requires that `'1` must outlive `'static`

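The problem is that the iterator holds a reference to the underlying data, but nothing in its type says so: a Box<dyn Iterator<...>> with no explicit lifetime bound is implicitly 'static. When you then try to construct an instance that does hold references, Rust lets you know about it. Your first attempt compiles because vec.into_iter() owns its items and borrows nothing.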
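
To make the contrast with the first attempt concrete, here is a minimal plain-Rust sketch (no pyo3). The function names owned_numbers and borrowed_numbers are made up for illustration. The first returns an iterator that owns its data, so the implicit 'static bound is satisfied; the second borrows from its argument and has to say so in the return type.

```rust
// Owns its data: `vec::IntoIter<u64>` contains no references, so it
// satisfies the implicit `'static` bound on the boxed trait object.
fn owned_numbers() -> Box<dyn Iterator<Item = u64> + Send> {
    Box::new(vec![1u64, 2, 3, 4, 5].into_iter())
}

// Borrows from `items`: the trait object has to declare that with `+ '_`.
// Without the `+ '_`, the compiler rejects this with a lifetime error
// similar to the one shown above.
fn borrowed_numbers(items: &[u64]) -> Box<dyn Iterator<Item = u64> + Send + '_> {
    Box::new(items.iter().copied())
}
```

Both functions can be consumed the same way, for example with .collect::<Vec<u64>>(); the only difference is how long the returned box is allowed to live.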

Without the Python FFI it can be easily fixed with an extra lifetime on the iterator:

struct ItemIterator<'a> {
    iter: Box<dyn Iterator<Item = u64> + Send + 'a>,
}
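
A fuller sketch of how that lifetime threads through get_items, again in plain Rust with the pyo3 attributes left out. Implementing the standard Iterator trait instead of __next__ is my own substitution here, just to keep the example self-contained.

```rust
struct ItemIterator<'a> {
    // `'a` records that the boxed iterator may borrow data living for `'a`.
    iter: Box<dyn Iterator<Item = u64> + Send + 'a>,
}

impl<'a> Iterator for ItemIterator<'a> {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        self.iter.next()
    }
}

struct Warehouse {
    items: Vec<u64>,
}

impl Warehouse {
    // The elided lifetime `'_` ties the returned iterator to `&self`,
    // so the iterator cannot outlive the warehouse it borrows from.
    fn get_items(&self) -> ItemIterator<'_> {
        ItemIterator {
            iter: Box::new(self.items.iter().copied()),
        }
    }
}
```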

Unfortunately, this won't work with the Python bindings, because pyo3 does not support lifetime or generic parameters on a #[pyclass]. This is annoying, because it means that your iterator must own all of its items.

One quick fix would be to clone the vector so that the iterator owns its items. That way, no lifetimes are needed. This should work, but will be very inefficient if there is a lot of data.

Another approach is shared ownership: a reference-counting smart pointer (Rc or Arc) combined with interior mutability (RefCell, RwLock or Mutex). However, this change has a knock-on effect: every use of the vector will need to change to deal with the smart pointer.

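use pyo3::prelude::*;
use std::{cell::RefCell, rc::Rc};

// Rc is not Send, so pyo3 needs `unsendable` here: the object can then only
// be used from the thread that created it. Arc<Mutex<...>> avoids this.
#[pyclass(unsendable)]
struct ItemIterator {
    items: Rc<RefCell<Vec<u64>>>,
    index: usize,
}

#[pymethods]
impl ItemIterator {
    fn __iter__(slf: PyRef<'_, Self>) -> PyRef<'_, Self> {
        slf
    }
    fn __next__(&mut self) -> Option<u64> {
        // Copy the item out so the RefCell borrow ends with this statement.
        let item = self.items.borrow().get(self.index).copied();
        self.index += 1;
        item
    }
}

#[pyclass(unsendable)]
struct Warehouse {
    items: Rc<RefCell<Vec<u64>>>,
}

#[pymethods]
impl Warehouse {
    fn get_items(&self) -> ItemIterator {
        ItemIterator {
            // Clones the pointer, not the vector.
            items: Rc::clone(&self.items),
            index: 0,
        }
    }
}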

This should now work because the exposed types and functions do not use lifetimes.
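
For reference, here is a thread-safe variant of the same idea, sketched in plain Rust so that it compiles with std alone. It assumes Arc<Mutex<...>> in place of Rc<RefCell<...>>, and next_item is a hypothetical stand-in for __next__. In a real module the #[pyclass] and #[pymethods] attributes would go back on.

```rust
use std::sync::{Arc, Mutex};

struct ItemIterator {
    items: Arc<Mutex<Vec<u64>>>,
    index: usize,
}

impl ItemIterator {
    // Stands in for `__next__`: copy out the item at `index`, then advance.
    // `unwrap` panics if another thread panicked while holding the lock.
    fn next_item(&mut self) -> Option<u64> {
        let item = self.items.lock().unwrap().get(self.index).copied();
        self.index += 1;
        item
    }
}

struct Warehouse {
    items: Arc<Mutex<Vec<u64>>>,
}

impl Warehouse {
    fn new() -> Warehouse {
        Warehouse {
            items: Arc::new(Mutex::new(vec![1, 2, 3, 4, 5])),
        }
    }

    // Cloning the `Arc` copies a pointer and bumps a count, not the vector.
    fn get_items(&self) -> ItemIterator {
        ItemIterator {
            items: Arc::clone(&self.items),
            index: 0,
        }
    }
}
```

Because the iterator holds its own Arc, it keeps the vector alive even after the Warehouse is dropped, which is the property the Python side needs.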

Upvotes: 1

Kity Cartman

Reputation: 896

Based on @peter-hall's suggestion, I managed to implement a working (though inefficient) solution:

#[pyclass]
struct ItemIterator {
    iter: std::vec::IntoIter<u64>,
}

#[pymethods]
impl ItemIterator {
    fn __iter__(slf: PyRef<'_, Self>) -> PyRef<'_, Self> {
        slf
    }
    fn __next__(mut slf: PyRefMut<'_, Self>) -> Option<u64> {
        slf.iter.next()
    }
}

#[pyclass]
struct Warehouse {
    items: Vec<u64>,
}

#[pymethods]
impl Warehouse {
    #[new]
    fn new() -> Warehouse {
        Warehouse {
            items: vec![1u64, 2, 3, 4, 5],
        }
    }

    fn get_items(&self) -> ItemIterator {
        ItemIterator {
            // Clone the vector so the iterator owns its items.
            iter: self.items.clone().into_iter(),
        }
    }
}
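
If the items are never mutated after the warehouse is built, one way to avoid copying the whole vector on every call might be to share it immutably behind an Arc<[u64]>. This is a plain-Rust sketch of that idea, not something I have checked against a particular pyo3 version; the pyo3 attributes are omitted, and the standard Iterator trait stands in for __next__.

```rust
use std::sync::Arc;

struct ItemIterator {
    items: Arc<[u64]>, // shared, immutable view of the warehouse data
    index: usize,
}

impl Iterator for ItemIterator {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        let item = self.items.get(self.index).copied();
        self.index += 1;
        item
    }
}

struct Warehouse {
    items: Arc<[u64]>,
}

impl Warehouse {
    fn new() -> Warehouse {
        Warehouse {
            items: Arc::from(vec![1u64, 2, 3, 4, 5]),
        }
    }

    fn get_items(&self) -> ItemIterator {
        // Bumps a reference count; the items themselves are not copied.
        ItemIterator {
            items: Arc::clone(&self.items),
            index: 0,
        }
    }
}
```

The trade-off is that the warehouse can no longer modify its items in place; if it needs to, a lock around the data or the cloning approach above is the simpler choice.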

Upvotes: 0
