T Powers
T Powers

Reputation: 1647

Better way to get tree representation of directory using F#?

I am new(ish) to F# and am trying to get a tree representation of a filesystem directory. Here's what I came up with:

type FSEntry =
  | File of name:string
  | Directory of name:string * entries:seq<FSEntry>

let BuildFSDirectoryTreeNonTailRecursive path = 
  let rec GetEntries (directoryInfo:System.IO.DirectoryInfo) =
    directoryInfo.EnumerateFileSystemInfos("*", System.IO.SearchOption.TopDirectoryOnly)
    |> Seq.map (fun info -> 
        match info with 
        | :? System.IO.FileInfo as file -> File (file.Name)
        | :? System.IO.DirectoryInfo as dir -> Directory (dir.Name, GetEntries dir)
        | _ -> failwith "Illegal FileSystemInfo type"
        )

  let directoryInfo = System.IO.DirectoryInfo path
  Directory (path, GetEntries directoryInfo)

But... pretty sure that isn't tail recursive. I took a look at the generated IL and didn't see any tail prefix. Is there a better way to do this? I tried using an accumulator but didn't see how that helps. I tried mutual recursive functions and got nowhere. Maybe a continuation would work but I found that confusing.

(I know that stack-depth won't be an issue in this particular case but still would like to know how to tackle this non-tail recursion problem in general)

OTOH, it does seem to work. The following prints out what I am expecting:

let PrintFSEntry fsEntry =
  let rec printFSEntryHelper indent entry  =
      match entry with
      | File name -> printfn "%s%s" indent name
      | Directory(name, entries) ->
        printfn "%s\\%s" indent name
        entries 
        |> Seq.sortBy (function | File name -> 0 | Directory (name, entries) -> 1)
        |> Seq.iter (printFSEntryHelper (indent + "  "))

  printFSEntryHelper "" fsEntry

This should probably be a different question but... how does one go about testing BuildFSDirectoryTreeNonTailRecursive? I suppose I could create an interface and mock it like I would in C#, but I thought F# had better approaches.

Edited: Based on the initial comments, I specified that I know stack space probably isn't an issue. I also specify I'm mainly concerned with testing the first function.

Upvotes: 0

Views: 432

Answers (1)

scrwtp
scrwtp

Reputation: 13577

To expand on my comment from earlier - unless you anticipate working with inputs that would cause a stack overflow without tail recursion, there's nothing to be gained from making a function tail-recursive. For your case, the limiting factor is the ~260 characters in path name, beyond which most Windows APIs will start to break. You'll hit that way before you start running out of stack space due to non-tail recursion.

As for testing, you want your functions to be as close to a pure function as possible. This involves refactoring out the pieces of the function that are side-effecting. This is the case with both of your functions - one of them implicitly depends on the filesystem, the other prints text directly to the standard output.

I guess the refactoring I suggest is fairly close to Mark Seemann's points: few mocks - checked, few interfaces - checked, function composition - checked. The example you have however doesn't lend itself nicely to it, because it's an extremely thin veneer over EnumerateFileSystemInfo. I can get rid of System.IO like this:

type FSInfo = DirInfo of string * string | FileInfo of string

let build enumerate path = 
    let rec entries path = 
        enumerate path
        |> Seq.map (fun info ->
           match info with
           | DirInfo (name, path) ->  Directory(name, entries path)
           | FileInfo name -> File name)

    Directory(path, entries path)

And now I'm left with an enumerate: string -> seq<FSInfo> function that can easily be replaced with a test implementation that doesn't even touch the drive. Then the default implementation of enumerate would be:

let enumerateFileSystem path = 
    let directoryInfo = DirectoryInfo(path)
    directoryInfo.EnumerateFileSystemInfos("*", System.IO.SearchOption.TopDirectoryOnly)
    |> Seq.map (fun info -> 
        match info with 
        | :? System.IO.FileInfo as file -> FileInfo (file.Name)
        | :? System.IO.DirectoryInfo as dir -> DirInfo (dir.Name, dir.FullName)
        | _ -> failwith "Illegal FileSystemInfo type")

You can see that it has virtually the same shape as the build function, minus recursion, since the entire 'core' of your logic is in EnumerateFileSystemInfos which lives beyond your code. This is a slight improvement, not in any way test-induced damage, but still it's not something that will make it onto anyone's slides anytime soon.

Upvotes: 2

Related Questions