Balinth
Balinth

Reputation: 568

F# binding or composing together parsers acting on separate source streams

How do I compose parser functions, in a way that they execute on different source streams, while the later ones depend on the earlier one's results? Say the following two:

let outerP = many (pchar 'a') |>> (fun l -> l.Length)
let innerP x = pstring "something" |>> (fun str -> (str,x))

with a single source, the binding is working nicely:

let combinedP = outerP >>= innerP
run combinedP "aasomething"

but as part of a more complex project, I need to parse together several separate files, whit the later parsers using the earlier one's output. eg.: i have

let outerSource = "aaaaa"
let innerSource = "something"

The obvious solution is to just concatenate the files together, but is not very scalable, especially because there is in fact a list of inner source files, etc...

Background: I am new to functional programming, not sure if this is taking the function composition too far, but seems like it should be the good solution here, just I can't figure out it in this case. Below is the working but non functional solution, which leads to a multi level nested code in the real project.

What works with the separate source files:

let combinedScript =
    let outerR = run outerP outerSource
    match outerR with
    | Success (outerParam,_,_) ->
        let innerR = run (innerP outerParam) innerSource
        innerR

In the real code, this is a 4 level deep pyramid of doom, and looking at it, it is basically what bind does, just with an extra change (the different source)

Upvotes: 1

Views: 62

Answers (2)

Fyodor Soikin
Fyodor Soikin

Reputation: 80765

First, why do you think that your solution is non-functional? "Functional" doesn't mean "beautiful" or "elegant". Functional code can be just as ugly and convoluted as object-oriented. The mere fact that it's a pyramid of doom doesn't make it less functional.

Second, it is not "almost" what bind is doing, it is exactly what bind is doing. The fact that you have extra values in there doesn't change that. In fact, if the bound functions could use nothing but their immediate input, the usefulness of bind would be rather limited.

To avoid the pyramid of doom, you can leverage the F# syntax nicely. For example, this works:

let x =
    20    |> fun a ->
    a + 1 |> fun b ->
    b * 2

// x == 42

This example employs two nested functions, with result of the previous one passed into the next. It can be rewritten as:

let x =  (fun a -> (fun b -> b * 2) (a + 1)) 20

But I leverage operator |> and F# offside rules to make it look like a "step by step" computation.

You can do a similar thing if you define a similar composition operator for ParseResult<_,_>:

// (|>>=) : ParseResult<'a, 'e> -> ('a -> ParseResult<'b, 'e>) -> ParseResult<'b, 'e>
let (|>>=) res f = match res with 
    | Success (x, _, _) -> f x
    | Failure (x, y, z) -> Failure (x, y, z)

// And then use it:
let combinedScript =
   run outerP outerSource |>>= fun outerR ->
   run (innerP outerR) innerSource |>>= fun innerR ->
   run (nextP innerR) nextSource |>>= fun nextR ->
   ... and so on

Note that my implementation of operator |>>= throws away 'UserState and Position (the last two parameters of Success). If you don't care about those, this solution is sufficient. Otherwise, you'll need to figure out how to combine the ones that came in res with the ones returned by f x.

Upvotes: 2

rmunn
rmunn

Reputation: 36708

Your last sentence contains a clue to a good functional way to do this: "... looking at it, it is basically what bind does, just with an extra change (the different source)"

Let's turn your 4-level pyramid of doom into a nice-looking expression by implementing our own bind-like function. I'm going to take your combinedScript expression and turn outerP and outerSource (and innerP and innerSource) into function parameters, and you'll hopefully be pleased with the results.

let combinedScript (outerP, outerSource) (innerP, innerSource) =
    let outerR = run outerP outerSource
    match outerR with
    | Success (outerParam,_,_) ->
        let innerR = run (innerP outerParam) innerSource
        innerR
    | Failure (msg, err, state) ->
        Failure (msg, err, state)

// And we'll define an operator for it
let (>==>) (outerP, outerSource) (innerP, innerSource) =
    combinedScript (outerP, outerSource) (innerP, innerSource)

// Now you can parse your four files like this
let parseResult =
    (parserA, fileA)
    >==> (parserB, fileB)
    >==> (parserC, fileC)
    >==> (parserD, fileD)

What's really great about functional programming is that I wrote the above without having to even think about it, because turning pyramids of doom into flat lists is a well-known recipe. As you said, that's basically what "bind" does. And all I did above is follow the standard recipe for writing a "bind" function. If you don't yet know the standard recipe for "bind" functions, https://fsharpforfunandprofit.com/series/map-and-bind-and-apply-oh-my.html is the best explanation I've found. If you're anything like me, you'll have to read it about four or five times before something finally goes "click" in your brain, but once you have that "Ah-HA!" moment, you'll gain a far deeper understanding of the power of functional programming, and how it lets you do really advanced things really simply.

P.S. If that article series is too advanced for where you are right now in your understanding of FP, then try https://fsharpforfunandprofit.com/posts/recipe-part2/ and https://fsharpforfunandprofit.com/rop/. One of those might be a better introduction to these concepts, depending on how much you already understand.

Upvotes: 2

Related Questions