Nick Fortescue

Reputation: 44193

How do I list all files in a subdirectory in scala?

Is there a good "scala-esque" (I guess I mean functional) way of recursively listing files in a directory? What about matching a particular pattern?

For example recursively all files matching "a*.foo" in c:\temp.

Upvotes: 102

Views: 74614

Answers (23)

chinayangyongyong

Reputation: 37

Get all files under a path, excluding directories:

import java.io.File
import scala.collection.mutable.ArrayBuffer

object pojo2pojo {

    def main(args: Array[String]): Unit = {
        val file = new File("D:\\tmp\\tmp")
        val files = recursiveListFiles(file)
        println(files.toList)
        // List(D:\tmp\tmp\1.txt, D:\tmp\tmp\a\2.txt)
    }

    def recursiveListFiles(f: File):ArrayBuffer[File] = {
        val all = collection.mutable.ArrayBuffer(f.listFiles:_*)
        val files = all.filter(_.isFile)
        val dirs = all.filter(_.isDirectory)
        files ++ dirs.flatMap(recursiveListFiles)
    }

}


Upvotes: 0

Sakthi Priyan H

Reputation: 13

A minor improvement to the accepted answer.
By partitioning on _.isDirectory, this function returns only files.
(Directories are excluded.)

import java.io.File
def recursiveListFiles(f: File): Array[File] = {
  val (dir, files)  = f.listFiles.partition(_.isDirectory)
  files ++ dir.flatMap(recursiveListFiles)
}

Upvotes: 0

Calvin Kessler

Reputation: 31

The deepFiles method of scala.reflect.io.Directory provides a pretty nice way of recursively getting all the files in a directory:

import scala.reflect.io.Directory
new Directory(f).deepFiles.filter(x => x.name.startsWith("a") && x.name.endsWith(".foo"))

deepFiles returns an iterator, so you can convert it to some other collection type if you don't need or want lazy evaluation.

Upvotes: 1

Powers

Reputation: 19338

os-lib is the easiest way to recursively list files in Scala.

os.walk(os.pwd/"countries").filter(os.isFile(_))

Here's how to recursively list all the files that match the "a*.foo" pattern specified in the question:

os.walk(os.pwd/"countries").filter(_.segments.toList.last matches "a.*\\.foo")

os-lib is way more elegant and powerful than other alternatives. It returns os objects that you can easily move, rename, whatever. You don't need to suffer with the clunky Java libraries anymore.

Here's a code snippet you can run if you'd like to experiment with this library on your local machine:

os.makeDir(os.pwd/"countries")
os.makeDir(os.pwd/"countries"/"colombia")
os.write(os.pwd/"countries"/"colombia"/"medellin.txt", "q mas pues")
os.write(os.pwd/"countries"/"colombia"/"a_something.foo", "soy un rolo")
os.makeDir(os.pwd/"countries"/"brasil")
os.write(os.pwd/"countries"/"brasil"/"a_whatever.foo", "carnaval")
os.write(os.pwd/"countries"/"brasil"/"a_city.txt", "carnaval")

println(os.walk(os.pwd/"countries").filter(os.isFile(_))) will return this:

ArraySeq(
  /.../countries/brasil/a_whatever.foo, 
  /.../countries/brasil/a_city.txt, 
  /.../countries/colombia/a_something.foo, 
  /.../countries/colombia/medellin.txt)

os.walk(os.pwd/"countries").filter(_.segments.toList.last matches "a.*\\.foo") will return this:

ArraySeq(
  /.../countries/brasil/a_whatever.foo, 
  /.../countries/colombia/a_something.foo)

See here for more details on how to use the os-lib.

Upvotes: 2

yura

Reputation: 14655

I would prefer a solution with Streams, because they let you iterate lazily over an arbitrarily large (even infinite) file system (Streams are lazily evaluated collections):

import java.io.File

def getFileTree(f: File): Stream[File] =
        f #:: (if (f.isDirectory) f.listFiles().toStream.flatMap(getFileTree) 
               else Stream.empty)

Example for searching

getFileTree(new File("c:\\main_dir")).filter(_.getName.endsWith(".scala")).foreach(println)

Upvotes: 48

monzonj

Reputation: 3709

As of Java 1.7 you should all be using java.nio. It offers close-to-native performance (java.io is very slow) and has some useful helpers.

But Java 1.8 introduces exactly what you are looking for:

import java.nio.file.{FileSystems, Files}
import scala.collection.JavaConverters._
val dir = FileSystems.getDefault.getPath("/some/path/here") 

Files.walk(dir).iterator().asScala.filter(Files.isRegularFile(_)).foreach(println)

You also asked for file matching. Try java.nio.file.Files.find and also java.nio.file.Files.newDirectoryStream

See documentation here: http://docs.oracle.com/javase/tutorial/essential/io/walk.html
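For the pattern-matching part, here is a minimal sketch that combines Files.walk with a glob PathMatcher. The object name GlobSearch and the helper findByGlob are made up for illustration and are not part of java.nio; the glob is matched against the file name only, and the walk is closed explicitly because Files.walk holds directory handles open.

```scala
import java.nio.file.{FileSystems, Files, Path}
import scala.collection.JavaConverters._

object GlobSearch {
  // Hypothetical helper: recursively collect regular files under `root`
  // whose file name matches a glob such as "a*.foo".
  def findByGlob(root: Path, glob: String): List[Path] = {
    val matcher = FileSystems.getDefault.getPathMatcher("glob:" + glob)
    val stream = Files.walk(root)
    try {
      stream.iterator().asScala
        .filter(p => Files.isRegularFile(p) && matcher.matches(p.getFileName))
        .toList // force the result before the stream is closed
    } finally stream.close()
  }
}
```

Something like GlobSearch.findByGlob(dir, "a*.foo") should then cover the example from the question.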

Upvotes: 35

Milind

Reputation: 1

You can use tail recursion for it:

object DirectoryTraversal {
  import java.io._

  def main(args: Array[String]): Unit = {
    val dir = new File("C:/Windows")
    val files = scan(dir)

    val out = new PrintWriter(new File("out.txt"))

    files foreach { file =>
      out.println(file)
    }

    out.flush()
    out.close()
  }

  def scan(file: File): List[File] = {

    @scala.annotation.tailrec
    def sc(acc: List[File], files: List[File]): List[File] = {
      files match {
        case Nil => acc
        case x :: xs => {
          x.isDirectory match {
            case false => sc(x :: acc, xs)
            case true => sc(acc, xs ::: x.listFiles.toList)
          }
        }
      }
    }

    sc(List(), List(file))
  }
}

Upvotes: 0

Phil

Reputation: 50506

No one has mentioned better-files yet: https://github.com/pathikrit/better-files

val dir = "src"/"test"
val matches: Iterator[File] = dir.glob("**/*.{java,scala}")
// above code is equivalent to:
dir.listRecursively.filter(f => f.extension == 
                      Some(".java") || f.extension == Some(".scala")) 

Upvotes: 9

Phil

Reputation: 50506

for (file <- new File("c:\\").listFiles) { processFile(file) }

http://langref.org/scala+java/files

Upvotes: 20

draw

Reputation: 4846

It seems nobody has mentioned the scala-io library from scala-incubator...

import scalax.file.Path

Path.fromString("c:\\temp") ** "a*.foo"

Or with an implicit conversion:

import scalax.file.ImplicitConversions.string2path

"c:\\temp" ** "a*.foo"

Or if you want implicit explicitly...

import scalax.file.Path
import scalax.file.ImplicitConversions.string2path

val dir: Path = "c:\\temp"
dir ** "a*.foo"

Documentation is available here: http://jesseeichar.github.io/scala-io-doc/0.4.3/index.html#!/file/glob_based_path_sets

Upvotes: 1

Brent Faust

Reputation: 9319

The simplest Scala-only solution (if you don't mind requiring the Scala compiler library):

val path = scala.reflect.io.Path(dir)
scala.tools.nsc.io.Path.onlyFiles(path.walk).foreach(println)

Otherwise, @Renaud's solution is short and sweet (if you don't mind pulling in Apache Commons FileUtils):

import scala.collection.JavaConverters._  // provides .asScala
import org.apache.commons.io.FileUtils
FileUtils.listFiles(dir, null, true).asScala.foreach(println)

Where dir is a java.io.File:

new File("path/to/dir")

Upvotes: 3

polbotinka

Reputation: 498

I personally like the elegance and simplicity of @Rex Kerr's proposed solution. But here is what a tail-recursive version might look like:

import java.io.File
import scala.annotation.tailrec

def listFiles(file: File): List[File] = {
  @tailrec
  def listFiles(files: List[File], result: List[File]): List[File] = files match {
    case Nil => result
    case head :: tail if head.isDirectory =>
      listFiles(Option(head.listFiles).map(_.toList ::: tail).getOrElse(tail), result)
    case head :: tail if head.isFile =>
      listFiles(tail, head :: result)
    case _ :: tail => // neither a directory nor a regular file (e.g. it does not exist): skip it
      listFiles(tail, result)
  }
  listFiles(List(file), Nil)
}

Upvotes: 5

roterl

Reputation: 1883

Scala has the library 'scala.reflect.io', which is considered experimental but does the job:

import scala.reflect.io.Path
Path(path) walkFilter { p => 
  p.isDirectory || """^a.*\.foo$""".r.findFirstIn(p.name).isDefined
}

Upvotes: 3

Nicolas Rouquette

Reputation: 458

Why are you using Java's File instead of Scala's AbstractFile?

With Scala's AbstractFile, the iterator support allows writing a more concise version of James Moore's solution:

import scala.reflect.io.AbstractFile  
def tree(root: AbstractFile, descendCheck: AbstractFile => Boolean = {_=>true}): Stream[AbstractFile] =
  if (root == null || !root.exists) Stream.empty
  else
    (root.exists, root.isDirectory && descendCheck(root)) match {
      case (false, _) => Stream.empty
      case (true, true) => root #:: root.iterator.flatMap { tree(_, descendCheck) }.toStream
      case (true, false) => Stream(root)
    }

Upvotes: -1

Dino Fancellu

Reputation: 2004

How about

   def allFiles(path:File):List[File]=
   {    
       val parts=path.listFiles.toList.partition(_.isDirectory)
       parts._2 ::: parts._1.flatMap(allFiles)         
   }

Upvotes: 3

Renaud

Reputation: 16521

Apache Commons IO's FileUtils fits on one line, and is quite readable:

import java.io.File
import scala.collection.JavaConverters._ // provides .asScala
import org.apache.commons.io.FileUtils

FileUtils.listFiles(new File("c:\\temp"), Array("foo"), true).asScala.foreach { f =>
  println(f) // process each matching file here
}

Upvotes: 6

Connor Doyle

Reputation: 1912

This incantation works for me:

  def findFiles(dir: File, criterion: (File) => Boolean): Seq[File] = {
    if (dir.isFile) Seq()
    else {
      val (files, dirs) = dir.listFiles.partition(_.isFile)
      files.filter(criterion).toSeq ++ dirs.toSeq.flatMap(findFiles(_, criterion))
    }
  }

Upvotes: 0

James Moore

Reputation: 9026

And here's a mixture of the stream solution from @DuncanMcGregor with the filter from @Rick-777:

  def tree( root: File, descendCheck: File => Boolean = { _ => true } ): Stream[File] = {
    require(root != null)
    def directoryEntries(f: File) = for {
      direntries <- Option(f.list).toStream
      d <- direntries
    } yield new File(f, d)
    val shouldDescend = root.isDirectory && descendCheck(root)
    ( root.exists, shouldDescend ) match {
      case ( false, _) => Stream.Empty
      case ( true, true ) => root #:: ( directoryEntries(root) flatMap { tree( _, descendCheck ) } )
      case ( true, false) => Stream( root )
    }   
  }

  def treeIgnoringHiddenFilesAndDirectories( root: File ) = tree( root, { !_.isHidden } ) filter { !_.isHidden }

This gives you a Stream[File] instead of a (potentially huge and very slow) List[File] while letting you decide which sorts of directories to recurse into with the descendCheck() function.
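On Scala 2.13 and later, Stream is deprecated in favour of LazyList. The following is a rough sketch of the same idea under that assumption; the object name LazyTree and the children helper are made up for illustration. As in the version above, a directory rejected by descendCheck is still returned itself, but its contents are not visited.

```scala
import java.io.File

object LazyTree {
  // Lists a directory lazily; listFiles returns null on I/O errors or
  // missing permissions, which is treated here as "no children".
  private def children(dir: File): LazyList[File] =
    Option(dir.listFiles).map(_.toList.to(LazyList)).getOrElse(LazyList.empty)

  // Assumes Scala 2.13+ (LazyList). descendCheck decides which
  // directories are recursed into.
  def tree(root: File, descendCheck: File => Boolean = _ => true): LazyList[File] =
    if (!root.exists) LazyList.empty
    else if (root.isDirectory && descendCheck(root))
      root #:: children(root).flatMap(tree(_, descendCheck))
    else LazyList(root)
}
```

Because the tail of #:: is evaluated on demand, a subdirectory is only listed once the traversal actually reaches it.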

Upvotes: 3

Rick-777

Reputation: 10268

Here's a similar solution to Rex Kerr's, but incorporating a file filter:

import java.io.File
def findFiles(fileFilter: (File) => Boolean = (f) => true)(f: File): List[File] = {
  val ss = f.list()
  val list = if (ss == null) {
    Nil
  } else {
    ss.toList.sorted
  }
  val visible = list.filter(_.charAt(0) != '.')
  val these = visible.map(new File(f, _))
  these.filter(fileFilter) ++ these.filter(_.isDirectory).flatMap(findFiles(fileFilter))
}

The method returns a List[File], which is slightly more convenient than Array[File]. It also ignores all hidden files and directories (i.e. those whose names begin with '.').

It's partially applied using a file filter of your choosing, for example:

val srcDir = new File( ... )
val htmlFiles = findFiles( _.getName endsWith ".html" )( srcDir )

Upvotes: 1

Duncan McGregor

Reputation: 18177

I like yura's stream solution, but it (and the others) recurses into hidden directories. We can also simplify by making use of the fact that listFiles returns null for a non-directory.

def tree(root: File, skipHidden: Boolean = false): Stream[File] = 
  if (!root.exists || (skipHidden && root.isHidden)) Stream.empty 
  else root #:: (
    root.listFiles match {
      case null => Stream.empty
      case files => files.toStream.flatMap(tree(_, skipHidden))
  })

Now we can list files

tree(new File(".")).filter(f => f.isFile && f.getName.endsWith(".html")).foreach(println)

or realise the whole stream for later processing

tree(new File("dir"), true).toArray

Upvotes: 11

ArtemGr

Reputation: 12567

Scala is a multi-paradigm language. A good "scala-esque" way of iterating a directory would be to reuse existing code!

I'd consider using commons-io a perfectly scala-esque way of iterating a directory. You can use some implicit conversions to make it easier. Like

import java.io.File
import org.apache.commons.io.filefilter.IOFileFilter
implicit def newIOFileFilter (filter: File=>Boolean) = new IOFileFilter {
  def accept (file: File) = filter (file)
  def accept (dir: File, name: String) = filter (new java.io.File (dir, name))
}

Upvotes: 11

Rex Kerr

Reputation: 167911

Scala code typically uses Java classes for dealing with I/O, including reading directories. So you have to do something like:

import java.io.File
def recursiveListFiles(f: File): Array[File] = {
  val these = f.listFiles
  these ++ these.filter(_.isDirectory).flatMap(recursiveListFiles)
}

You could collect all the files and then filter using a regex:

myBigFileArray.filter(f => """.*\.html$""".r.findFirstIn(f.getName).isDefined)

Or you could incorporate the regex into the recursive search:

import scala.util.matching.Regex
def recursiveListFiles(f: File, r: Regex): Array[File] = {
  val these = f.listFiles
  val good = these.filter(f => r.findFirstIn(f.getName).isDefined)
  good ++ these.filter(_.isDirectory).flatMap(recursiveListFiles(_,r))
}
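
One caveat: java.io.File.listFiles returns null when the path is not a directory or cannot be read, so the versions above can throw a NullPointerException on such input. A possible null-safe variant, as a sketch only (the wrapping object SafeListing is just there to keep the snippet self-contained):

```scala
import java.io.File

object SafeListing {
  // Same shape as recursiveListFiles above, but a null result from
  // listFiles (not a directory, I/O error, no permission) is treated
  // as an empty listing instead of causing a NullPointerException.
  def recursiveListFiles(f: File): Array[File] = {
    val these = Option(f.listFiles).getOrElse(Array.empty[File])
    these ++ these.filter(_.isDirectory).flatMap(d => recursiveListFiles(d))
  }
}
```

As in the original, the result contains both files and directories below f.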

Upvotes: 124

Don Mackenzie

Reputation: 7963

Take a look at scala.tools.nsc.io

There are some very useful utilities there including deep listing functionality on the Directory class.

If I remember correctly, this was highlighted (possibly contributed) by retronym and was seen as a stopgap until io gets a fresh and more complete implementation in the standard library.

Upvotes: 3
