Novemberland

Reputation: 550

Extraction of line contents fails in flex/bison

I am trying to extract the contents of a line and print them when bison rejects the input on that line. I am following the suggestions at http://archive.oreilly.com/pub/a/linux/excerpts/9780596155971/error-reporting-recovery.html, but when input is rejected, the next line is printed instead of the line that was rejected, although the line number is printed correctly.

flex:

%{
#include <stdio.h>
#include <string.h>   /* strncpy */
#include "parser.tab.h"
int line_number = 0;
char linebuf[500];
%}
...

%%
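\n.*  { ++line_number;
        /* save the text of the line that follows the newline */
        strncpy(linebuf, yytext+1, sizeof(linebuf) - 1);
        linebuf[sizeof(linebuf) - 1] = '\0';  /* strncpy may not terminate */
        yyless(1);      /* give back all but the \n to rescan */
      }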
%%

bison:

 %{
    #include <stdio.h>
    #include <assert.h>
    #include <string.h>
    #include <stdlib.h>
    #include "parser.tab.h"

    extern int yylex(void);
    extern int line_number;
    extern char linebuf[500];
    void yyerror(char const *s);
    %}
...
%%
int main(){
    if (yyparse() == 0)
        printf("Accepted\n");
    else
        printf("Syntax error in line %d: %s\n", line_number, linebuf);
...

On input that bison rejects, the approach above prints the line after the one that contains the grammatical error.

input:
result = function //(semicolon expected)
else

output:

Syntax error in line 1: else

I believe the lexical rule \n.* or the yytext+1 offset pushes the output to the next line, but which lexical rule is the correct one?

Upvotes: 0

Views: 214

Answers (1)

Chris Dodd

Reputation: 126243

This happens because bison uses a 1-token lookahead to parse. So the missing semicolon is not noticed (or diagnosed) until after the scanner reads and returns the ELSE token. At this point, the preceding rule (which is expecting a semicolon or something to make a longer expression) can't match (there is no shift or reduce action on token ELSE in that state).

Once the error is noticed, the parser calls yyerror, which prints the message (and the most recently read line, which is the one with the ELSE token).
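
One possible workaround, given the lookahead behaviour described above, is to remember the line of the token before the lookahead and report that one instead. The following is only a minimal sketch in plain C: the names line_info, note_new_line, note_token and error_line are hypothetical helpers, not part of flex or bison, and the sketch assumes the scanner calls note_new_line each time it saves a new line and note_token each time it returns a token.

```c
#include <assert.h>
#include <stdio.h>

#define LINE_BUF_SIZE 500

/* Hypothetical record of one input line: its number and its text. */
struct line_info {
    int  number;
    char text[LINE_BUF_SIZE];
};

static struct line_info cur_line;      /* line currently being scanned     */
static struct line_info tok_line;      /* line of the most recent token    */
static struct line_info prev_tok_line; /* line of the token before that    */

/* Assumed to be called from the scanner whenever a new line starts. */
void note_new_line(const char *text)
{
    assert(text != NULL);
    cur_line.number++;
    /* snprintf truncates if needed and always NUL-terminates */
    snprintf(cur_line.text, sizeof cur_line.text, "%s", text);
}

/* Assumed to be called from the scanner just before it returns a token. */
void note_token(void)
{
    prev_tok_line = tok_line;  /* what was the newest token is now the previous one */
    tok_line = cur_line;       /* the new token belongs to the current line */
}

/* Line of the token that preceded the lookahead token. */
const struct line_info *error_line(void)
{
    return &prev_tok_line;
}
```

With the input from the question, the scanner would record "result = function", return three tokens, record "else", and return ELSE; at that point error_line() refers to line 1 with the text "result = function". This is a heuristic: if the offending token really is the first token on its own line, the report points one line too early. Bison's %locations feature, which tracks a position per token in yylloc, is another route worth looking at.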

Upvotes: 1
