Friedrich Gretz

Reputation: 535

FParsec - limit number of characters that a parser is applied to

I have a problem where, while parsing a stream, I get to a point where the next N characters need to be parsed by applying a specific parser multiple times (in sequence).

A stripped-down toy example:

17<tag><anothertag><a42...
  ^
  |- I'm here

Let's say the 17 indicates that the next N=17 characters make up tags, so I need to apply my "tagParser" repeatedly but stop after 17 chars and not consume the rest, even if it looks like a tag, because the rest has a different meaning and will be parsed by another parser.

I cannot use many or many1 because that would eat the stream beyond those N characters. Nor can I use parray because I do not know how many successful applications of that parser fit within the N characters.

I was looking into manyMinMaxSatisfy but could not figure out how to make use of it in this case.

Is there a way to cut N chars of a stream and feed them to some parser? Or is there a way to invoke many applications but up to N chars?

Thanks.

Upvotes: 2

Views: 146

Answers (3)

Founder Fang

Reputation: 11

You can also do this without going down to the stream level.

open FParsec

let ptag =
    between
        (skipChar '<')
        (skipChar '>')
        (manySatisfy (fun c -> c <> '>'))

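let tagsFromChars (l: char[]) =
    let s = System.String(l)
    // Re-parse the extracted characters as a sequence of tags.
    match run (many ptag) s with
    | Success(result, _, _) -> result
    | Failure(_, _, _) -> [] // note: an inner parse failure is silently turned into an empty list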

let parser =
    parse {
        let! nChars = pint32
        let! tags = parray nChars anyChar |>> tagsFromChars
        let! rest = restOfLine true
        return tags, rest
    }

run parser "17<tag><anothertag><a42..."
    |> printfn "%A"
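
One caveat with the snippet above is that a failure inside the chunk becomes an empty list, and trailing characters inside the chunk that are not tags are ignored. Below is a minimal sketch of a variant that reports such cases as a parse error instead. The helper name limitedTo is hypothetical (not part of FParsec), and the sketch assumes the user state is unit so that run can be used on the chunk. Positions in the inner error message are relative to the chunk, not to the whole input.

```fsharp
open FParsec

let ptag : Parser<string, unit> =
    between (skipChar '<') (skipChar '>') (manySatisfy (fun c -> c <> '>'))

// Hypothetical helper: take exactly n chars, then require that `inner`
// consumes all of them. An inner failure becomes a failure of the outer parser.
let limitedTo (n: int) (inner: Parser<'a, unit>) : Parser<'a, unit> =
    anyString n >>= fun chunk ->
        match run (inner .>> eof) chunk with
        | Success (result, _, _) -> preturn result
        | Failure (msg, _, _) -> fail msg

let parser =
    pint32 >>= fun n -> limitedTo n (many ptag) .>>. restOfLine true

run parser "17<tag><anothertag><a42..."
    |> printfn "%A"
```

Here anyString cuts out the N characters, and the eof after the inner parser is what rejects a chunk that is only partly made of tags.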

Upvotes: 1

JL0PD

Reputation: 4518

This is a fairly low-level parser that operates on raw Reply objects. It reads the character count, takes a substring of that length to feed to the tags parser, and then consumes the rest. There should be an easier way, but I don't have much experience with FParsec.

open FParsec

type Tag = Tag of string

let pTag = // parses tag string and constructs 'Tag' object
    skipChar '<' >>. many1Satisfy isLetter .>> skipChar '>' 
    |>> Tag

let pCountPrefixedTags stream =
    let count = pint32 stream // read chars count
    if count.Status = Ok then
        let count = count.Result
        // take exactly 'count' chars
        let tags = manyMinMaxSatisfy count count (fun _ -> true) stream
        if tags.Status = Ok then
            // parse substring with tags
            let res = run (many1 pTag) tags.Result
            match res with
            | Success (res, _, _) -> Reply(res)
            | Failure (_, error, _) -> Reply(ReplyStatus.Error, error.Messages)
        else
            Reply(tags.Status, tags.Error)
    else
        Reply(count.Status, count.Error)

let consumeStream =
    many1Satisfy (fun _ -> true)

run (pCountPrefixedTags .>>. consumeStream) "17<tag><anothertag><notTag..."
|> printfn "%A" // Success: ([Tag "tag"; Tag "anothertag"], "<notTag...")

Upvotes: 2

Brian Berns

Reputation: 17153

You can use getPosition to make sure you don't go past the specified number of characters. I threw this together (using F# 6) and it seems to work, although simpler/faster solutions may be possible:

open FParsec

let manyLimit nChars p =
    parse {
        let! startPos = getPosition

        let rec loop values =
            parse {
                let! curPos = getPosition
                let nRemain = (startPos.Index + nChars) - curPos.Index
                if nRemain = 0 then
                    return values
                elif nRemain > 0 then
                    let! value = p
                    return! loop (value :: values)
                else
                    return! fail $"limit exceeded by {-nRemain} chars"
            }

        let! values = loop []
        return values |> List.rev
    }

Test code:

let ptag =
    between
        (skipChar '<')
        (skipChar '>')
        (manySatisfy (fun c -> c <> '>'))
    
let parser =
    parse {
        let! nChars = pint64
        let! tags = manyLimit nChars ptag
        let! rest = restOfLine true
        return tags, rest
    }

run parser "17<tag><anothertag><a42..."
    |> printfn "%A"

Output is:

Success: (["tag"; "anothertag"], "<a42...")

Upvotes: 3
