Bison error recovery and balanced parentheses

Question

Suppose we have a file where each line is composed of strings of balanced parentheses. We can recognise this by putting the following classic grammar into Bison:

/* ignoring preamble and lexer definitions */
%% 
lines: line | lines line;
line: balanced NEWLINE;

balanced: %empty
| balanced '(' balanced ')'
| balanced '[' balanced ']'
| balanced '{' balanced '}'
| balanced '<' balanced '>';

The next improvement is to handle imbalanced brackets (or braces, or ...). In this case we want the parser to blindly consume everything up to the NEWLINE and start afresh on the next line. We can do this using Bison's error recovery, although it is ungainly. The grammar below shifts the error token when it encounters an imbalanced bracket; that gives it access to the any nonterminal, which lets it slurp up the rest of the line.

%% 
lines: line | lines line;
line: balanced NEWLINE;

balanced: %empty
| balanced '(' balanced ')'
| balanced '[' balanced ']'
| balanced '{' balanced '}'
| balanced '<' balanced '>';

/* below are new */

any: %empty 
| any '(' | any ')' | any '[' | any ']'
| any '{' | any '}' | any '<' | any '>';

line:
balanced error ')' any NEWLINE
| balanced error ']' any NEWLINE { /* do something special when an out of order ']' is detected */ }
| balanced error '}' any NEWLINE
| balanced error '>' any NEWLINE;

My question is how to handle the next improvement: autocomplete truncated lines. For example, if our line is

<>(([])<

I want to be able to determine the closing string, in this case >), or equivalently (<, since I can reverse the order and swap opening tokens for closing ones. I know how to detect and skip over truncated lines:

line: balanced error NEWLINE;

but this results in everything from the first open paren being discarded. Is there any way of accessing Bison's internal token stack before it pops everything? Or some grammatical construct that will let me unwind it one symbol at a time?
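
To pin down the result I am after, here is a minimal sketch in plain C, independent of Bison, that computes the closing string for a truncated line. The names closer_for and closing_string are hypothetical helpers invented for this illustration, and the fixed 256-entry stack is an arbitrary assumption. Characters that are not brackets are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not Bison API): map an opening bracket to its
   closer, or return 0 if c is not an opening bracket. */
static char closer_for(char c)
{
    switch (c) {
    case '(': return ')';
    case '[': return ']';
    case '{': return '}';
    case '<': return '>';
    default:  return 0;
    }
}

/* Write into out the closers needed to complete a truncated line.
   Returns 0 on success, or -1 if the line is imbalanced (wrong closer,
   or a closer with nothing open), nested too deeply for this sketch,
   or out is too small. */
static int closing_string(const char *line, char *out, size_t cap)
{
    char stack[256];   /* pending closers, innermost last */
    size_t depth = 0;

    assert(out != NULL && cap > 0);
    out[0] = '\0';

    for (const char *p = line; *p != '\0' && *p != '\n'; ++p) {
        char close = closer_for(*p);
        if (close != 0) {
            if (depth == sizeof stack)
                return -1;
            stack[depth++] = close;
        } else if (strchr(")]}>", *p) != NULL) {
            if (depth == 0 || stack[depth - 1] != *p)
                return -1;   /* imbalanced, not merely truncated */
            --depth;
        }
    }

    if (depth + 1 > cap)
        return -1;
    for (size_t i = 0; i < depth; ++i)
        out[i] = stack[depth - 1 - i];   /* innermost closer first */
    out[depth] = '\0';
    return 0;
}
```

For the example line <>(([])< this fills out with >). It duplicates what the parser already knows, which is exactly what I would like to avoid.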

I'm currently doing something disgusting: pushing the opening symbols [({< onto a stack in a context object from the lexer, and popping that stack whenever balanced is reduced, so that if we enter error recovery I can access the open brackets via the context. But, well, this is rotten.
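
Roughly, the workaround looks like the sketch below. The struct and function names (bracket_ctx, ctx_push, ctx_pop, ctx_take_unclosed) are hypothetical and stand in for my real code; none of them are Bison API. It assumes the context pointer is handed to both the lexer and the parser, and that the stack is reset at the end of every line, including the lines handled by the imbalanced-bracket rules.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical context object shared by the lexer and the parser. */
struct bracket_ctx {
    char   open[256];   /* opening brackets seen but not yet matched */
    size_t depth;
};

/* Called by the lexer each time it returns an opening bracket token.
   Returns -1 if the stack is full. */
static int ctx_push(struct bracket_ctx *ctx, char opener)
{
    if (ctx->depth == sizeof ctx->open)
        return -1;
    ctx->open[ctx->depth++] = opener;
    return 0;
}

/* Called from the action of each non-empty `balanced` rule, i.e. once
   per matched pair. */
static void ctx_pop(struct bracket_ctx *ctx)
{
    assert(ctx->depth > 0);
    --ctx->depth;
}

/* Called from the action of `line: balanced error NEWLINE`. Copies the
   unclosed openers (outermost first) into out, truncating if needed,
   resets the stack for the next line, and returns the number copied. */
static size_t ctx_take_unclosed(struct bracket_ctx *ctx, char *out, size_t cap)
{
    size_t n = ctx->depth;

    assert(out != NULL && cap > 0);
    if (n >= cap)
        n = cap - 1;
    memcpy(out, ctx->open, n);
    out[n] = '\0';
    ctx->depth = 0;
    return n;
}
```

On the example line this leaves (< in the context when the error rule fires, and reversing and swapping that gives the closing string.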

I've also tried looking at the internals of the Bison parser, but it seems the internal stack only becomes accessible from the semantic actions of error rules after everything has already been popped off it.
