Fixing a bad JSON grammar

Question

I've just started learning about parsing, and I wrote this simple parser in Haskell (using parsec) to read JSON and construct a simple tree for it. I am using the grammar in RFC 4627.

However, when I try parsing the string {"x":1 }, I'm getting the output:

parse error at (line 1, column 8):
unexpected "}"
expecting whitespace character or ","

This only seems to happen when I have spaces before a closing bracket (]) or closing brace (}).

What have I done wrong? If I avoid whitespace before a closing symbol, it works perfectly.

dflemstr · Accepted Answer

Parsec doesn't do rewinding and backtracking automatically. When you write sepBy member valueSeparator, the valueSeparator consumes white space, so the parser will parse your value like so:

{"x":1 }
[------- object
%        beginObject
 [-]     name
    %    nameSeparator
     %   jvalue
      [- valueSeparator
       X In valueSeparator: unexpected "}"

Legend:
[--]     full match
%        full char match
[--      incomplete match
X        incomplete char match

When the valueSeparator fails, Parsec won't go back and try a different combination of parses, because valueSeparator has already consumed one character (the space before the closing brace).
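To see the same failure in isolation, here is a minimal sketch. The question's own definitions aren't reproduced here, so leadingWsComma and digitsThenBrace are hypothetical stand-ins: a separator that skips white space before looking for the comma, used with sepBy in front of a closing brace.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical separator in the style described above: it skips leading
-- white space before it looks for the comma.
leadingWsComma :: Parser Char
leadingWsComma = spaces *> char ','

-- Hypothetical stand-in for the object body: digits separated by the
-- separator above, followed by optional white space and a closing brace.
digitsThenBrace :: Parser String
digitsThenBrace = (digit `sepBy` leadingWsComma) <* spaces <* char '}'

-- demo "1}"    succeeds: the separator fails without consuming anything.
-- demo "1 ,2}" succeeds: the separator matches the space and the comma.
-- demo "1 }"   fails: the separator eats the space, then meets '}' and
--              cannot give the space back.
demo :: String -> Either ParseError String
demo = parse digitsThenBrace "(demo)"
```

The third case is the one from the question: by the time the separator notices there is no comma, the space is gone, so sepBy reports an error instead of ending the list.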

You have two options to solve your problem:

1. Since white space is insignificant in JSON, always consume white space after a significant token, never before. A tok should therefore consume white space only after the char, so its definition becomes tok c = char c *> ws (with (*>) from Control.Applicative). Apply the same rule to all the other parsers. Because you then never consume white space after having entered the "wrong" parser, you never end up having to backtrack.
2. Use backtracking in Parsec by adding try in front of parsers that might consume more than one character and that should rewind their input if they fail.
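A rough sketch of both options follows. Only tok and ws are names taken from the text above; name, number, member, object, parseObject, commaWithTry, rawNumber and numbersWithTry are made up for illustration, and the value parsers are heavily simplified (integer values only, no string escapes), so treat this as an outline rather than a full RFC 4627 parser.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- White space characters allowed by RFC 4627.
ws :: Parser ()
ws = skipMany (oneOf " \t\n\r")

-- Option 1: every token matches its own text first, then skips the
-- white space that follows it.
tok :: Char -> Parser ()
tok c = char c *> ws

-- Simplified stand-ins for the real parsers (assumptions for this sketch).
name :: Parser String
name = between (char '"') (char '"') (many (noneOf "\"")) <* ws

number :: Parser Integer
number = read <$> many1 digit <* ws

member :: Parser (String, Integer)
member = (,) <$> name <* tok ':' <*> number

object :: Parser [(String, Integer)]
object = between (tok '{') (tok '}') (member `sepBy` tok ',')

-- Leading white space is skipped once, at the very start of the input.
parseObject :: String -> Either ParseError [(String, Integer)]
parseObject = parse (ws *> object <* eof) "(input)"

-- Option 2: keep a separator that skips leading white space, but wrap the
-- part that can fail late in try, so a failed match rewinds the input.
commaWithTry :: Parser ()
commaWithTry = try (ws *> char ',') *> ws

-- A number parser that does NOT skip trailing white space, so the
-- separator really does see the space first.
rawNumber :: Parser Integer
rawNumber = read <$> many1 digit

numbersWithTry :: Parser [Integer]
numbersWithTry =
  between (char '[') (ws *> char ']') (rawNumber `sepBy` commaWithTry)
```

With these definitions, parseObject "{\"x\":1 }" yields Right [("x",1)], and parse numbersWithTry "" "[1 ,2 ]" yields Right [1,2]. Option 1 tends to be the tidier of the two, since no parser ever has to undo work it has already done.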

EDIT: updated ASCII graphic to make more sense.
