Asik

Reputation: 22133

How to make this code more idiomatic F#?

I have a function A similar to this which applies a function B to every file in a directory. Each file has a certain number of "entries"; the function B takes the current total of entries as a parameter and returns the number of new entries found in the current file.

Also, I need to count the number of files processed and display this count each time a file is processed. Due to my imperative background, I came up with 2 mutable variables and a for loop.

let files = Directory.EnumerateFiles sourceDirectory
let mutable numEntries = 0
let numFiles = Seq.length files
let mutable index = 0
for file in files do
     printfn "done %d of %d" index numFiles
     let numNewEntries = processFile file numEntries
     numEntries <- numEntries + numNewEntries
     index <- index + 1

So, a few questions:

Upvotes: 0

Views: 430

Answers (2)

John Palmer

Reputation: 25516

Here is a more functional example:

let files = Directory.EnumerateFiles sourceDirectory
let numFiles = Seq.length files
files 
|> Seq.mapi (fun idx file -> (idx,file)) // Get access to the index in a loop
|> Seq.fold (fun numentries (index,file) ->
         printfn "done %d of %d" index numFiles
         numentries + (processFile file numentries)
         ) 0

Using mapi gives access to the index inside the loop, which eliminates the first mutable variable. The second is eliminated by using fold, whose accumulator keeps track of the running total of entries in place of a mutable variable.

The main advantage is that, without any mutable state, the code is easier to convert to run on multiple threads. Also, because the values are immutable, the code becomes simpler to reason about.
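
As a rough illustration, here is a self-contained sketch of the same mapi/fold pattern. It assumes a stubbed `processFile` and a hard-coded list of hypothetical file names in place of a real directory listing; neither comes from the original question.

```fsharp
// Hypothetical stand-in for the asker's processFile: takes a file name and the
// running total of entries, and returns the number of new entries in that file.
// Here it simply returns the length of the file name.
let processFile (file: string) (totalSoFar: int) : int =
    file.Length

// Hypothetical file names, used instead of Directory.EnumerateFiles.
let files = [ "a.txt"; "bb.txt"; "ccc.txt" ]
let numFiles = List.length files

let totalEntries =
    files
    |> List.mapi (fun index file -> index, file) // pair each file with its index
    |> List.fold (fun numEntries (index, file) ->
        printfn "done %d of %d" index numFiles
        // The accumulator carries the running total, so no mutable is needed.
        numEntries + processFile file numEntries) 0

printfn "total entries: %d" totalEntries
```

With this stub, the fold adds up the name lengths 5, 6 and 7, so `totalEntries` ends up as 18.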

Upvotes: 6

ildjarn

Reputation: 62975

Assuming that what you're ultimately after is the final value of numEntries, then here's my take:

let getNumEntries sourceDirectory =
    Directory.GetFiles sourceDirectory
    |> fun files -> (0, 0, files.Length), files
    ||> Array.fold (fun (index, numEntries, numFiles) file ->
        printfn "done %d of %d" index numFiles
        index + 1, numEntries + processFile file numEntries, numFiles)
    |> fun (_,numEntries,_) -> numEntries

If all you're after is side-effects in processFile rather than the final numEntries value, then replace fun (_,numEntries,_) -> numEntries with ignore.
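
A minimal sketch of that `ignore` variant, assuming a stubbed `processFile` and a hypothetical `processAll` wrapper that takes the file array directly (both are illustrative names, not part of the original code):

```fsharp
// Hypothetical stub: prints what it is given and reports one new entry per file.
let processFile (file: string) (totalSoFar: int) : int =
    printfn "processing %s (entries so far: %d)" file totalSoFar
    1

// Hypothetical wrapper: same fold as above, but the final state is discarded
// because only the side effects of processFile matter here.
let processAll (files: string[]) : unit =
    ((0, 0, files.Length), files)
    ||> Array.fold (fun (index, numEntries, numFiles) file ->
        printfn "done %d of %d" index numFiles
        index + 1, numEntries + processFile file numEntries, numFiles)
    |> ignore

processAll [| "a.txt"; "b.txt" |]
```

The tuple state still threads the index and running total through the fold; `ignore` only drops the result once the last file has been processed.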


Can you explain the advantages to a more idiomatic solution? I'm very new to functional programming and sometimes I don't see what's wrong with my dirty imperative for loops.

Besides being subjective, that's rather broad and has been answered much more thoroughly in multiple other answers than I could do here.

Upvotes: 1
