xehpuk

Reputation: 8240

Can I convert an Iterator of Result to Result of Iterator?

Until now, I have used std::fs::read_to_string and then str::lines, which returns std::str::Lines (an Iterator<Item = &str>), to read a file "line by line". This obviously reads the whole file into memory, which is not ideal.

So, there's BufRead::lines() to read a file truly line by line. This returns std::io::Lines (which is an Iterator<Item = io::Result<String>>).

How do I convert from one iterator type to the other without collecting first?

Upvotes: 3

Views: 2161

Answers (4)

Ethan S-L

Reputation: 420

Great question. To highlight the problem: iterators use closures which are their own tiny universes. This means that you can't simply bubble up errors inside the iterator to a parent.

Contrary to what people are saying, Iterator<..Result<_>> ~(try)~> Result<Iterator<.._>> is entirely reasonable, so long as you're willing to accept that the transformation is fallible.

The bog standard library does not have a solution. You either work with iterators of Result (or Options or ControlFlow) types or you collect -- with .collect() implementing exactly this sort of logic.

Options:

  1. NOTE the standard option with .collect() is usually good. The compiler will often optimize away unnecessary allocations. So if you're not performance sensitive enough to be profiling anyway then you're probably good just using .collect() for convenience's sake. (And if you are profiling you can check whether you incur a cost.)

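  2. Try-based methods: the Try trait, which abstracts over various things like Option, Result, and ControlFlow, is itself still nightly-only, but the iterator methods built on it (.try_for_each(), .try_fold()) are usable on stable with Result and Option. So you can just use .try_for_each(|_| ..).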

  3. Crate: If you want to stay on a stable version, but are down to use an outside crate then Itertools is a popular extender of iterator methods and has .process_results()

If you're in some very particular situation where you need to be on stable, not use an external crate, are profiling, and find that the compiler isn't able to remove the unnecessary allocation then you can write a custom iteration method or just use a for-loop.
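
As a rough sketch of the standard-library-only route, the two functions below stop at the first error without collecting. The function names are made up for illustration, and a byte slice stands in for the file so the example is self-contained. As far as I know, try_for_each accepts a closure returning Result on current stable Rust.

```rust
use std::io::{self, BufRead};

// Hypothetical helper: sums the byte lengths of all lines,
// short-circuiting on the first I/O error via try_for_each.
fn total_len_try_for_each<R: BufRead>(reader: R) -> io::Result<usize> {
    let mut total = 0;
    reader.lines().try_for_each(|line| -> io::Result<()> {
        total += line?.len();
        Ok(())
    })?;
    Ok(total)
}

// The same logic as a plain for-loop; `?` bubbles the error
// out of the enclosing function, which a closure cannot do.
fn total_len_for_loop<R: BufRead>(reader: R) -> io::Result<usize> {
    let mut total = 0;
    for line in reader.lines() {
        total += line?.len();
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // `&[u8]` implements BufRead, so it can stand in for a BufReader<File>.
    let data = "ab\ncde\n".as_bytes();
    println!("{}", total_len_try_for_each(data)?);
    println!("{}", total_len_for_loop(data)?);
    Ok(())
}
```

Neither version allocates a Vec; each line is dropped as soon as it has been processed.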

Upvotes: 0

Stephen Funk

Reputation: 1

It makes sense to avoid a collection, since that would mean running through the whole iterator once just to unpack all of the Results.

I suspect that the problem you really want to solve is how to map/filter an iterator of Result<String, _> without calling unwrap on every line. The approach here is not to turn Iter<Result<T, E>> into Result<Iter<T>, E>, but rather to unpack the Result inside each map/filter step, then repackage the output in a Result so that any errors are pushed through to the next step.

Here's a generic example:

use std::{
    fs::File,
    io::{BufRead, BufReader, Error, Lines},
};

fn parse_line(input: String) -> usize {
    // .. Dummy code that works on an input line
    todo!()
}

fn parse_lines() {
    let lines: Lines<BufReader<File>> = BufReader::new(
        File::open("my_file.txt").unwrap()
    ).lines();

    // HERE! Iterate over Result<String, Error>
    let new_iter = lines.map(|line: Result<String, Error>| {
        // We can't pass a `Result` into our `parse_line` function,
        // so we unpack it first.
        match line {
            // If no error, do work with the contents of the `Ok` val.
            Ok(s) => Ok(parse_line(s)),
            // We don't want to do any destructive error handling
            // prematurely, so we pass any errors back up the chain.
            Err(e) => Err(e),
        }
    });
}

Note that in almost every instance, you'll still want to do some sort of error handling at the end of the map/filter chain. This is typically done by collecting the iterator into a Result type, e.g. Result<Vec<_>>, as mentioned by Chayim. But the approach I demonstrated avoids calling collect multiple times.
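
For what it's worth, the match in the example is what Result::map already does, so the same step can probably be written more compactly. This is only a sketch: parse_line here is a placeholder that returns the line length, parsed_lines is an invented name, and a byte slice replaces the file.

```rust
use std::io::{self, BufRead};

// Placeholder for the real per-line work (assumption: returns the length).
fn parse_line(input: String) -> usize {
    input.len()
}

// Lazily maps each Ok(line) through parse_line and passes Err through untouched.
fn parsed_lines<R: BufRead>(reader: R) -> impl Iterator<Item = io::Result<usize>> {
    reader.lines().map(|line| line.map(parse_line))
}

fn main() {
    let data = "one\nthree\n".as_bytes();
    for parsed in parsed_lines(data) {
        match parsed {
            Ok(n) => println!("{n}"),
            Err(e) => eprintln!("read error: {e}"),
        }
    }
}
```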

Upvotes: -1

Chayim Friedman

Reputation: 71430

You cannot transform an Iterator<Item = Result<_, _>> into a Result<Iterator<Item = _>, _>, because until we have iterated the iterator we don't know whether it will yield an error.

What you can do is to collect() all items ahead of time into a Result<Vec<_>, _> (which of course you can iterate over) since Result implements FromIterator.
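
A minimal sketch of the collect() route, assuming a helper named read_all_lines (not part of any library) and a byte slice in place of a real file:

```rust
use std::io::{self, BufRead};

// Collects every line up front; the first Err aborts the collection
// and is returned instead of the Vec.
fn read_all_lines<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    reader.lines().collect()
}

fn main() -> io::Result<()> {
    let lines = read_all_lines("a\nb\n".as_bytes())?;
    // From here on we iterate over plain Strings, no Result involved.
    for line in &lines {
        println!("{line}");
    }
    Ok(())
}
```

The trade-off is that all lines are held in memory at once, which is what the question was trying to avoid.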

If you're fine with getting Err only for the first Err (and successfully iterating over all items until that), you can also use itertools::process_results():

let result: Result<SomeType, _> = itertools::process_results(iter, |iter| -> SomeType {
    // Here we have `iter` of type `Iterator<Item = _>`. Process it and return some result.
});

Upvotes: 2

cafce25

Reputation: 27549

You can't: there has to be an owner of the values, which is the full String in the case of str::lines. You can, however, turn the Iterator<Item = Result<String>> into an iterator over Strings:

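use std::{fs::File, io::{BufRead, BufReader}};
// Note: unwrap_or_default() replaces any read error with an empty String.
let read = BufReader::new(File::open("src/main.rs").unwrap());
let lines_iter = read.lines().map(Result::unwrap_or_default);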

You can take an Iterator over items of either String or &str like this:

fn solve<T: AsRef<str>>(input: impl Iterator<Item = T>) {
    for line in input {
        let line = line.as_ref();
        // do something with line
    }
}
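
To illustrate, here is a sketch of calling such a generic function with both kinds of iterator. count_non_empty is a hypothetical variant of solve that returns a value so the result can be inspected, and a byte slice stands in for the BufReader.

```rust
use std::io::BufRead;

// Hypothetical variant of `solve`: counts lines that are not blank.
// Works for any item type that can be viewed as &str.
fn count_non_empty<T: AsRef<str>>(input: impl Iterator<Item = T>) -> usize {
    input
        .filter(|line| !line.as_ref().trim().is_empty())
        .count()
}

fn main() {
    let text = "a\n\nb\n";

    // Borrowed items: str::lines yields &str.
    println!("{}", count_non_empty(text.lines()));

    // Owned items: BufRead::lines yields io::Result<String>;
    // errors are silently turned into empty Strings here.
    let owned = BufRead::lines(text.as_bytes()).map(Result::unwrap_or_default);
    println!("{}", count_non_empty(owned));
}
```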

Upvotes: -1
