Eric Smith

Reputation: 123

FParsec - how to parse strings separated by pipes?

I'm using FParsec to write a small org-mode parser, for fun, and I'm having a little trouble parsing a table row into a list of strings. My current code looks like this:

let parseRowEntries :Parser<RowEntries, unit> =
    let skipInitialPipe = skipChar '|'
    let notaPipe  = function
        | '|' -> false
        | _ -> true
    let pipeSep = pchar '|'

    skipInitialPipe >>. sepEndBy (many1Satisfy notaPipe) pipeSep
    |>> RowEntries

This works fine until you parse the string |blah\n|blah\n|blah|, which should fail because of the newline characters. Unfortunately, simply making notaPipe return false for \n causes the parser to stop after the first 'blah' and report a successful parse. What I want many1Satisfy to do is parse (almost) any character, stopping at a pipe and failing on a newline (and probably on end of input as well).

I've tried using charsTillString, but that also just halts parsing at the first pipe, without an error.

Upvotes: 1

Views: 312

Answers (1)

rmunn

Reputation: 36678

If I've understood your spec correctly, this should work:

let parseOneRow :Parser<_, unit> =
    let notaPipe  = function
        | '|' -> false
        | '\n' -> false
        | _ -> true
    let pipe = pchar '|'

    pipe >>. manyTill (many1Satisfy notaPipe .>> pipe) (skipNewline <|> eof)

let parseRowEntries :Parser<_, unit> =
    many parseOneRow

run parseRowEntries "|row|with|four|columns|\n|second|row|"
// Success: [["row"; "with"; "four"; "columns"]; ["second"; "row"]]

The structure is that each row starts with a pipe, and the segments within a row are then conceptually row|, with|, and so on. The .>> combinator keeps the segment text and discards the pipe that follows it. The "till" part of that line uses skipNewline instead of newline because <|> needs both alternatives to return the same type: eof returns unit, whereas newline returns a char. skipNewline expects a newline and returns unit, so it fits.
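To make the typing point concrete, here is a small sketch of three ways to build the row terminator. The names endOfRow1 to endOfRow3 are made up for illustration, and the script assumes it is run with dotnet fsi so that the NuGet reference resolves.

```fsharp
#r "nuget: FParsec"
open FParsec

// newline returns a char and eof returns unit, so this would not type-check:
// let bad : Parser<unit, unit> = newline <|> eof

// skipNewline already returns unit, so it can sit next to eof directly.
let endOfRow1 : Parser<unit, unit> = skipNewline <|> eof

// Alternatives (hypothetical names): discard the char result of newline by hand.
let endOfRow2 : Parser<unit, unit> = (newline >>% ()) <|> eof
let endOfRow3 : Parser<unit, unit> = (newline |>> ignore) <|> eof
```

All three should accept the same input; skipNewline is simply the shortest spelling.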

I've tried throwing newlines in where they don't belong (before the pipes, for example), and that causes this parser to fail exactly as it should. It also fails if a column is empty (that is, two pipe characters occur side by side, like ||), which I think is also what you want. If you want to allow empty columns, just use manySatisfy instead of many1Satisfy.
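Here is a rough sketch comparing the two choices side by side. The helper names rowsWith, strictRows, lenientRows and tryParse are invented for this example, and the script assumes dotnet fsi so that the NuGet reference resolves.

```fsharp
#r "nuget: FParsec"
open FParsec

let notaPipe c = c <> '|' && c <> '\n'

// Hypothetical helper: the same row grammar, parameterised by the cell parser.
let rowsWith (cell: Parser<string, unit>) : Parser<string list list, unit> =
    let pipe = pchar '|'
    many (pipe >>. manyTill (cell .>> pipe) (skipNewline <|> eof))

// many1Satisfy needs at least one character per cell; manySatisfy allows none.
let strictRows : Parser<string list list, unit> = rowsWith (many1Satisfy notaPipe)
let lenientRows : Parser<string list list, unit> = rowsWith (manySatisfy notaPipe)

// Collapse the parser result into an option to keep the comparison short.
let tryParse p input =
    match run p input with
    | Success (rows, _, _) -> Some rows
    | Failure _ -> None

printfn "%A" (tryParse strictRows "|a||b|")    // None
printfn "%A" (tryParse lenientRows "|a||b|")   // Some [["a"; ""; "b"]]
printfn "%A" (tryParse lenientRows "|a\n|b|")  // None
```

The last line shows that a newline inside a row is still rejected in the lenient version, since the cell parser stops at \n and the pipe that must follow is missing.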

Upvotes: 1
