How to handle parser errors

Question

I'm writing a JSON parser and I'm having trouble coming up with a good design for the error handling. Let's say that at some point the lexical analyser finds a token with a lexical error in it. How should it react? Should it stop right away or continue to the end of the string? How do parsers generally handle lexical errors?

rici · Accepted Answer

It depends on the purpose of the application, but in most cases a JSON parser should stop at the first error.

JSON is a data interchange format. In most applications, the input was originally created programmatically, and a syntax error indicates a corrupted communication or buggy generator. If the encoded data had been stored in a database, it could indicate corrupted storage. It might even be an indication of an attack: an attempt to modify the data en route or to hand-craft problematic data.

In such cases, the best strategy is usually to simply drop the data rather than attempting to "fix" it. Returning some kind of detailed error message is discouraged, because (a) the originating application is unlikely to be able to handle such a response, and (b) it might give an attacker additional information. Handling the incorrect data by attempting to guess what the correct representation might be could silently hide bugs in the application generating the data.

Of course, for debugging and logging purposes it is useful to be able to provide more informative error reports. Even then, it is rarely if ever useful to proceed beyond the first error.
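As a rough illustration of that split (stop at the first error, log the details, tell the sender nothing specific), here is a minimal sketch using Python's standard `json` module rather than a hand-written parser. The names `load_or_reject`, `RejectedInput` and the logger name are made up for this example; `json.JSONDecodeError` and its `msg`, `lineno` and `colno` attributes are part of the standard library.

```python
import json
import logging

logger = logging.getLogger("json_input")  # hypothetical logger name


class RejectedInput(Exception):
    """Hypothetical error type that deliberately carries no detail."""


def load_or_reject(text):
    """Decode JSON text, or drop it on the first syntax error."""
    try:
        return json.loads(text)  # stops at the first error it finds
    except json.JSONDecodeError as err:
        # The informative report goes to the log only, for debugging.
        logger.warning(
            "rejected JSON input: %s (line %d, column %d)",
            err.msg, err.lineno, err.colno,
        )
        # "from None" keeps the detailed error out of the exception chain
        # that the caller (and possibly the sender) might see.
        raise RejectedInput("invalid input") from None
```

The caller only learns that the input was rejected; whoever reads the log can still see where the first error was.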

It is possible that the application intends to use JSON as a human-editable data descriptor, for example as a configuration file. In that case, being able to find and report multiple errors may be useful, as it would be in the parser for a programming language. (But even then, it is not necessary.)
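For that human-edited case, one simple way for a lexer to keep going is to record the bad token and resume scanning at the next character. The sketch below is only one possible approach, not a description of how any particular parser does it; `TOKEN_RE`, `KEYWORDS` and `tokenize` are names invented for the example, and the recovery rule (skip the offending character or word) is an assumption chosen for brevity.

```python
import re

# One alternative per token kind. ERROR matches any single character that
# no other alternative accepts, so the scan never gets stuck.
TOKEN_RE = re.compile(r"""
    (?P<WS>[ \t\r\n]+)
  | (?P<PUNCT>[{}\[\],:])
  | (?P<STRING>"(?:[^"\\\x00-\x1f]|\\(?:["\\/bfnrt]|u[0-9a-fA-F]{4}))*")
  | (?P<NUMBER>-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?)
  | (?P<WORD>[A-Za-z]+)
  | (?P<ERROR>.)
""", re.VERBOSE | re.DOTALL)

KEYWORDS = {"true", "false", "null"}


def tokenize(text):
    """Return (tokens, errors); scanning continues past each bad token."""
    tokens, errors = [], []
    for m in TOKEN_RE.finditer(text):
        kind, value, pos = m.lastgroup, m.group(), m.start()
        if kind == "WS":
            continue
        if kind == "ERROR" or (kind == "WORD" and value not in KEYWORDS):
            # Record the problem and carry on with the next match.
            errors.append((pos, f"unexpected {value!r}"))
            continue
        tokens.append((kind, value))
    return tokens, errors


tokens, errors = tokenize('[1, tru, @, "ok"]')
for pos, message in errors:
    print(f"offset {pos}: {message}")
```

Stopping at the first error is the same loop with a `raise` in place of `errors.append(...)`, which is one reason the fail-fast design is the easier default.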
