Alexey Orlov

Reputation: 2824

Filtering files or directories discovered with fs::read_dir()

I've got this function:

fn folders(dir: &Path) -> Result<Vec<PathBuf>, io::Error> {
    fs::read_dir(dir)?
        .into_iter()
        .map(|x| x.map(|entry| entry.path()))
        .collect()
}

It's actually borrowed from here. The function is OK; unfortunately, I don't really understand how it works.

Ok(["/home/ey/dir-src/9", "/home/ey/dir-src/11", "/home/ey/dir-src/03 A Letter of Explanation.mp3", "/home/ey/dir-src/02 Egyptian Avenue.mp3", "/home/ey/dir-src/alfa", "/home/ey/dir-src/10"])

The test output shows both directories and files, as expected. I can't figure out where to put the filtering for files or directories. I also don't understand why there is a map inside a map: isn't this just a simple list of paths? What is really happening inside this expression?

UPD:

fn folders(dir: &Path) -> Result<Vec<PathBuf>, io::Error> {
    fs::read_dir(dir)?
        .into_iter()
        .map(|x| x.map(|entry| entry.path()))
        .filter(|x| {x.as_ref().map(|entry| entry); true})
        .collect()
}

I inserted a trivial filter (always true). It compiles, at least, but I still can't see how I am supposed to use entry to check for a file or directory. Sorry :)

Upvotes: 4

Views: 6741

Answers (2)

Eric BURGHARD

Reputation: 41

You can merge the two filter calls and keep the items as Results. The as_deref() call borrows the path, so you can check that it is a directory without consuming the Result.

fn folders(dir: &Path) -> Result<Vec<PathBuf>, io::Error> {
    fs::read_dir(dir)?
        .map(|r| r.map(|d| d.path()))
        .filter(|r| r.is_ok() && r.as_deref().unwrap().is_dir())
        .collect()
}

You can also combine filter and map with filter_map, and get rid of the unwrap with ok().and_then():

fn folders(dir: &Path) -> Result<Vec<PathBuf>, io::Error> {
    Ok(fs::read_dir(dir)?
        .filter_map(|e| {
            e.ok().and_then(|d| {
                let p = d.path();
                if p.is_dir() {
                    Some(p)
                } else {
                    None
                }
            })
        })
        .collect())
}
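
Note that both versions above silently drop entries that failed to read, so the returned Result can only be Err when read_dir itself fails. If you would rather have per-entry errors reach the caller, one possible sketch is to let Err items pass the filter. The name folders_keep_errors is made up for illustration:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Hypothetical variant: Err items pass the filter, so collect() can return
// the first one instead of silently skipping it.
fn folders_keep_errors(dir: &Path) -> Result<Vec<PathBuf>, io::Error> {
    fs::read_dir(dir)?
        .map(|r| r.map(|d| d.path()))
        // as_deref() only borrows: &Result<PathBuf, E> -> Result<&Path, &E>.
        // map_or(true, ..) keeps every Err and tests only the Ok paths.
        .filter(|r| r.as_deref().map_or(true, |p| p.is_dir()))
        .collect()
}
```

Whether dropping or reporting unreadable entries is the better behaviour depends on the caller; this only shows where the choice is made.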

Upvotes: 2

Sébastien Renauld

Reputation: 19672

Let's walk step by step through the chain.

fs::read_dir(dir)?

creates a read handle to the directory. The ? operator returns early with the Err if opening the directory fails; otherwise it unwraps the Ok value.

.into_iter()

is a no-op here: the ReadDir handle returned by read_dir() is already an iterator of io::Result<DirEntry>, so this call can be dropped.

.map(|x|
  x.map(|entry| entry.path())
)

This calls the path() method on every element of the iterator whose result is an actual DirEntry. Because each element is a Result<DirEntry> and not just a DirEntry, the inner map() (which is Result::map, not Iterator::map) converts the Ok value and passes any Err through unchanged. You're left with the paths you see in the output.
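
To see the two map() calls in isolation, here is a small sketch that uses stand-in data instead of the filesystem. The function name lengths and the string items are assumptions made up for the example; only the shape (an iterator of Result values) matches what read_dir() yields:

```rust
// Stand-in data: every item is a Result, like the items read_dir() yields.
fn lengths(items: Vec<Result<&str, String>>) -> Vec<Result<usize, String>> {
    items
        .into_iter()
        // Outer map: Iterator::map, runs the closure once per item.
        // Inner map: Result::map, transforms Ok(..) and leaves Err(..) as it is.
        .map(|item| item.map(|s| s.len()))
        .collect()
}
```

Calling lengths(vec![Ok("alfa"), Err("denied".to_string())]) gives back one Ok holding the length and the same Err, untouched.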

.collect()

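collects the iterator into the type named in the function's return signature, here Result<Vec<PathBuf>, io::Error>: a vector of all the paths if every element is Ok, or the first Err encountered.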
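
Collecting an iterator of Result values into a single Result is a standard-library feature and not specific to read_dir(). A minimal sketch with made-up data (parse_all is a hypothetical helper, and string parsing stands in for directory reading):

```rust
use std::num::ParseIntError;

// Each parse() yields a Result; collect() folds them into one Result.
// The first Err stops the collection and becomes the return value.
fn parse_all(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}
```

With inputs that all parse you get Ok with the full vector; as soon as one input fails, the whole call is an Err.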

The filtering part can be implemented before or after the call to map() to turn the entry into a PathBuf. If you need to filter based on the element itself and not the PathBuf, filter before it. If you can filter based on the PathBuf, filter after it.
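
As a sketch of the "filter before" option, the check can be done on the DirEntry itself through file_type(). The name folders_by_entry is made up, and treating an unreadable file type as "not a directory" is an assumption of this example. Note that DirEntry::file_type() does not follow symlinks, whereas Path::is_dir() does, so the two approaches can disagree on symlinked directories:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Hypothetical variant that filters on the DirEntry before it becomes a PathBuf.
fn folders_by_entry(dir: &Path) -> io::Result<Vec<PathBuf>> {
    fs::read_dir(dir)?
        .filter(|item| match item {
            // Assumption: if the file type cannot be read, skip the entry.
            Ok(entry) => entry.file_type().map(|t| t.is_dir()).unwrap_or(false),
            // Keep read errors so that collect() can report them.
            Err(_) => true,
        })
        .map(|item| item.map(|entry| entry.path()))
        .collect()
}
```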

The use of the filter() combinator function is straightforward - you give it a closure, it will apply it to every element. If the return of the closure is true, the element is kept. If it is false, the element is dropped.

Here is an example, to only return directories:

fn folders(dir: &Path) -> Result<Vec<PathBuf>, io::Error> {
    Ok(fs::read_dir(dir)?
        .into_iter()
        .filter(|r| r.is_ok()) // Get rid of Err variants for Result<DirEntry>
        .map(|r| r.unwrap().path()) // This is safe, since we only have the Ok variants
        .filter(|r| r.is_dir()) // Filter out non-folders
        .collect())
}

Upvotes: 9
