adam björkman

Reputation: 3

Regex match lines except linebreaks (FLEX, BISON)

We have a tutorial on flex and bison before our compilation techniques course starts at university.

The following text should be split into lines and newlines:

testtest test data
second line in the data
another line without a trailing newline

This is what my parser should output:

Line: testtest test data
NL
Line: second line in the data
NL 
Line: another line without a trailing newline

When I run the following:

cat test.txt | ./parser 

This returns:

LINE: testtest test data
It's a bad: syntax error

This is in my .y file:

%{
  #include <stdio.h>
  int yylex();            /* Suppress C99 warning on OSX */
  extern char *yytext;    /* Correct for Flex */
  unsigned int total;

%}
%token LINE
%token NL
%%
line    : LINE              {printf("LINE: %s\n", yytext);}
        ;
newline : NL                {printf("NL\n");}
        ;

And this is in my binary.flex file:

%top{
#define YYSTYPE int
#include "binary.tab.h"         /* Token values generated by bison */
}
%option noyywrap
%%
[^\n\r/]+   return LINE; 
\n          return NL;      
%%

Any ideas on how to solve this problem?

PS: This is my .c file

#include<stdio.h>
#include "binary.tab.h"
extern unsigned int total;

int yyerror(char *c)
{
  printf("It's a bad: %s\n", c);
  return 0;
}

int main(int argc, char **argv)
{
  if(!yyparse())
    printf("It's a mario time: %d\n",total);
  return 0;
}

Upvotes: 0

Views: 302

Answers (1)

rici

Reputation: 241771

Your bison grammar recognizes precisely one LINE (without a newline) because bison treats the first non-terminal in the grammar, here line, as the start symbol. Just that, and no more.

If you want to recognize multiple lines, each consisting of a LINE and possibly a NL, you'll need to add a definition for an input consisting of multiple lines, each consisting of ... . I'm not sure why you would use bison for this, though, since the original problem seems easy to solve with just flex.
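As a rough sketch of what such a definition could look like, here is one possible version of the .y file. The non-terminal names input and item are my own invention, not something from the question. It is also slightly looser than "a LINE and possibly a NL": it accepts any sequence of LINE and NL tokens, so empty lines are allowed too, and each rule still reduces a single token, which is what keeps the use of yytext in the action workable.

```yacc
%{
#include <stdio.h>
int yylex(void);
int yyerror(char *msg);   /* defined in the separate .c file */
extern char *yytext;      /* text of the token flex matched last */
unsigned int total;       /* still read by main() in the .c file */
%}
%token LINE
%token NL
%%
/* Start symbol: zero or more items (hypothetical names). */
input   : /* empty */
        | input item
        ;
/* One token per item, so the action runs before the next token is read. */
item    : LINE              {printf("LINE: %s\n", yytext);}
        | NL                {printf("NL\n");}
        ;
%%
```

Reading yytext from a bison action is fragile in general, because the parser may already have read a lookahead token by the time the action runs. If the grammar grows beyond single-token rules, it would probably be safer to copy the text into yylval in the scanner.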

By the way, if your input file includes a \r character, none of your flex patterns will recognize it (the flex-generated default rule will catch it, but that is almost never what you want). Use %option nodefault so that you get a warning about this sort of error. And react when you see warnings: you will have seen several when you ran bison on your bison file, I'm sure.
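A minimal sketch of a scanner with that option turned on might look like the following. Treating \r\n as a single NL, and reporting any other unmatched character on stderr, are assumptions on my part about what you want; the question doesn't say why / is excluded from LINE, so here it simply falls through to the catch-all rule.

```lex
%top{
#include <stdio.h>
#define YYSTYPE int
#include "binary.tab.h"         /* Token values generated by bison */
}
%option noyywrap nodefault
%%
[^\n\r/]+   { return LINE; }
\r?\n       { return NL; }      /* assumption: CRLF counts as one newline */
.           { fprintf(stderr, "unexpected character 0x%02x\n",
                      (unsigned char)yytext[0]); }
%%
```

With the explicit catch-all rule every input character is covered, so flex has nothing to warn about. If you remove that last rule, nodefault should make flex complain that the default rule can be matched, which is exactly the hint you were missing.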

Upvotes: 1
