I'm trying to write a toy language with the flex/bison toolchain in C++14.
I'm confused about combining Bison's C++ variant support with a reentrant flex scanner: yylex cannot find the parameter yylval.
My development environment is a MacBook with the latest macOS and Xcode, plus flex 2.6.4 and bison 3.7.1 installed through Homebrew.
For convenience, you can download the project that shows the error here: https://github.com/linrongbin16/tree.
Now let me introduce this not-so-simple tree project:
makefile
clean:
	rm *.o *.out *.yy.cc *.yy.hh *.tab.cc *.tab.hh *.output
tree.out: tree.o token.yy.o parser.tab.o
	clang++ -std=c++14 -o tree.out tree.o token.yy.o parser.tab.o
token.yy.cc token.yy.hh: token.l
	flex --debug -o token.yy.cc --header-file=token.yy.hh token.l
parser.tab.cc parser.tab.hh: parser.y
	bison --debug --verbose -Wcounterexamples -o parser.tab.cc --defines=parser.tab.hh parser.y
token.yy.o: token.yy.cc
	clang++ -std=c++14 -g -c token.yy.cc token.yy.hh
parser.tab.o: parser.tab.cc
	clang++ -std=c++14 -g -c parser.tab.cc parser.tab.hh
tree.o: tree.cpp parser.tab.hh token.yy.hh
	clang++ -std=c++14 -g -c tree.cpp
The application is tree.out, which depends on 3 components: tree, token and parser.
tree.h defines a simple abstract syntax tree class. Since I haven't implemented it yet, it has only a virtual destructor:
#pragma once
struct Tree {
  virtual ~Tree() = default;
};
tree.cpp contains the main function, which reads a file name from the command line, initializes the lexer and parser, and runs the parse:
#include "parser.tab.hh"
#include "token.yy.hh"
#include <cstdio>
#include <cstdlib>
struct Scanner {
  yyscan_t yyscanner;
  FILE *fp;
  YY_BUFFER_STATE yyBufferState;
  Scanner(const char *fileName) {
    yylex_init_extra(this, &yyscanner);
    fp = std::fopen(fileName, "r");
    if (!fp) {
      printf("file %s cannot open!\n", fileName);
      exit(-1);
    }
    yyBufferState = yy_create_buffer(fp, YY_BUF_SIZE, yyscanner);
    yy_switch_to_buffer(yyBufferState, yyscanner);
    yyset_lineno(1, yyscanner);
  }
  virtual ~Scanner() {
    if (yyBufferState) {
      yy_delete_buffer(yyBufferState, yyscanner);
    }
    if (yyscanner) {
      yylex_destroy(yyscanner);
    }
    if (fp) {
      std::fclose(fp);
    }
  }
};
int main(int argc, char **argv) {
  if (argc != 2) {
    printf("missing file name!\n");
    return -1;
  }
  Scanner scanner(argv[1]);
  yy::parser parser(scanner.yyscanner);
  if (parser.parse() != 0) {
    printf("parsing failed!\n");
    return -1;
  }
  return 0;
}
The important point is that I use Bison's C++ variant support together with flex's reentrant feature: I want the project to be modern (C++14) and safe for multithreading. That makes initialization a little complex, but it should be worth it once the project grows.
token.l:
%option noyywrap noinput nounput
%option nodefault
%option nounistd
%option reentrant
%{
#include <cstdio>
#include <cstring>
#include "parser.tab.hh"
%}
%%
"+" { yylval->emplace<int>(yy::parser::token::PLUS); return yy::parser::token::PLUS; }
"-" { yylval->emplace<int>(yy::parser::token::MINUS); return yy::parser::token::MINUS; }
"*" { yylval->emplace<int>(yy::parser::token::TIMES); return yy::parser::token::TIMES; }
"/" { yylval->emplace<int>(yy::parser::token::DIVIDE); return yy::parser::token::DIVIDE; }
"(" { yylval->emplace<int>(yy::parser::token::LPAREN); return yy::parser::token::LPAREN; }
")" { yylval->emplace<int>(yy::parser::token::RPAREN); return yy::parser::token::RPAREN; }
";" { yylval->emplace<int>(yy::parser::token::SEMICOLON); return yy::parser::token::SEMICOLON; }
"=" { yylval->emplace<int>(yy::parser::token::EQUAL); return yy::parser::token::EQUAL; }
[a-zA-Z][a-zA-Z0-9]+ { yylval->emplace<std::string>(yytext); return yy::parser::token::ID; }
[0-9]+ { yylval->emplace<int>(atoi(yytext)); return yy::parser::token::NUM; }
%%
Here I followed the split symbol section of the Bison manual (NOTE: this is where the compile errors occur; I also tried the make_XXX API, which gives me errors as well).
flex generates token.yy.cc and token.yy.hh from this file, and they are expected to compile into the token.yy.o object.
parser.y:
%require "3.2"
%language "c++"
%define api.value.type variant
%define api.token.constructor
%define parse.assert
%define parse.error verbose
%define parse.lac full
%locations
%param {yyscan_t yyscanner}
%code top {
#include <memory>
}
%code requires {
#include <memory>
#include "token.yy.hh"
#include "tree.h"
#define SP_NULL (std::shared_ptr<Tree>(nullptr))
}
%token<int> PLUS '+'
%token<int> MINUS '-'
%token<int> TIMES '*'
%token<int> DIVIDE '/'
%token<int> SEMICOLON ';'
%token<int> EQUAL '='
%token<int> LPAREN '('
%token<int> RPAREN ')'
%token<int> NUM
%token<std::string> ID
%type<std::shared_ptr<Tree>> prog assign expr literal
/* operator precedence */
%right EQUAL
%left PLUS MINUS
%left TIMES DIVIDE
%start prog
%%
prog : assign { $$ = SP_NULL; }
| prog ';' assign { $$ = SP_NULL; }
;
assign : ID '=' expr { $$ = SP_NULL; }
| expr { $$ = $1; }
;
expr : literal { $$ = SP_NULL; }
| expr '+' literal { $$ = SP_NULL; }
| expr '-' literal { $$ = SP_NULL; }
| expr '*' literal { $$ = SP_NULL; }
| expr '/' literal { $$ = SP_NULL; }
;
literal : ID { $$ = SP_NULL; }
| NUM { $$ = SP_NULL; }
;
%%
I followed the C++ variant section of the Bison manual. Bison generates parser.tab.cc, parser.tab.hh and parser.output; the .output file is only for analysis.
Since the flex scanner is reentrant, I need to add the parameter %param {yyscan_t yyscanner}.
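To make clear what I expect from that directive, here is a minimal hand-written sketch rather than real Bison output. As I understand it, %param adds the argument to the parser constructor and forwards the same value to every yylex call. The names sketch::symbol_type, sketch::parser and run_sketch are hypothetical stand-ins, and the stub yylex only counts how often it is called.

```cpp
#include <cassert>

using yyscan_t = void*;  // flex declares the reentrant scanner handle as an opaque pointer

namespace sketch {

// Hypothetical stand-in for yy::parser::symbol_type.
struct symbol_type {
  int kind;  // 0 means end of input in this sketch
};

}  // namespace sketch

// Stub scanner: treats the handle as a pointer to a call counter and
// immediately reports end of input.
sketch::symbol_type yylex(yyscan_t yyscanner) {
  ++*static_cast<int*>(yyscanner);
  return sketch::symbol_type{0};
}

namespace sketch {

// Hypothetical stand-in for the generated yy::parser class.
class parser {
public:
  // %param adds this constructor argument ...
  explicit parser(yyscan_t yyscanner_arg) : yyscanner(yyscanner_arg) {}

  int parse() {
    // ... and the same value is forwarded to the yylex call.
    symbol_type lookahead = yylex(yyscanner);
    return lookahead.kind == 0 ? 0 : 1;
  }

private:
  yyscan_t yyscanner;
};

}  // namespace sketch

// Usage: the counter shows that parse() reached yylex through the handle.
inline int run_sketch() {
  int calls = 0;
  sketch::parser p(&calls);
  int status = p.parse();
  assert(status == 0);
  return calls;
}
```

In the real project the handle comes from yylex_init_extra, as in the Scanner struct in tree.cpp.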
Here's the error output from make tree.out:
bison --debug --verbose -Wcounterexamples -o parser.tab.cc --defines=parser.tab.hh parser.y
flex --debug -o token.yy.cc --header-file=token.yy.hh token.l
clang++ -std=c++14 -g -c tree.cpp
clang++ -std=c++14 -g -c token.yy.cc token.yy.hh
token.yy.cc:820:10: error: use of undeclared identifier 'yyin'; did you mean 'yyg'?
if ( ! yyin )
^~~~
yyg
token.yy.cc:807:23: note: 'yyg' declared here
struct yyguts_t * yyg = (struct yyguts_t*)yyscanner;
^
token.yy.cc:822:4: error: use of undeclared identifier 'yyin'
yyin = stdin;
^
token.yy.cc:827:10: error: use of undeclared identifier 'yyout'
if ( ! yyout )
^
token.yy.cc:829:4: error: use of undeclared identifier 'yyout'
yyout = stdout;
^
token.yy.cc:837:23: error: use of undeclared identifier 'yyin'
yy_create_buffer( yyin, YY_BUF_SIZE , yyscanner);
^
token.yy.cc:895:3: error: use of undeclared identifier 'YY_DO_BEFORE_ACTION'
YY_DO_BEFORE_ACTION;
^
token.yy.cc:902:8: error: use of undeclared identifier 'yy_flex_debug'; did you mean 'yyget_debug'?
if ( yy_flex_debug )
^~~~~~~~~~~~~
yyget_debug
token.yy.cc:598:5: note: 'yyget_debug' declared here
int yyget_debug ( yyscan_t yyscanner );
^
token.yy.cc:908:45: error: use of undeclared identifier 'yytext'
(long)yy_rule_linenum[yy_act], yytext );
^
token.yy.cc:911:14: error: use of undeclared identifier 'yytext'
yytext );
^
token.l:12:3: error: use of undeclared identifier 'yylval'
{ yylval->emplace<int>(yy::parser::token::PLUS); return yy::parser::token::PLUS; }
^
token.l:13:3: error: use of undeclared identifier 'yylval'
{ yylval->emplace<int>(yy::parser::token::MINUS); return yy::parser::token::MINUS; }
^
token.l:14:3: error: use of undeclared identifier 'yylval'
{ yylval->emplace<int>(yy::parser::token::TIMES); return yy::parser::token::TIMES; }
^
token.l:15:3: error: use of undeclared identifier 'yylval'
{ yylval->emplace<int>(yy::parser::token::DIVIDE); return yy::parser::token::DIVIDE; }
^
token.l:16:3: error: use of undeclared identifier 'yylval'
{ yylval->emplace<int>(yy::parser::token::LPAREN); return yy::parser::token::LPAREN; }
^
token.l:17:3: error: use of undeclared identifier 'yylval'
{ yylval->emplace<int>(yy::parser::token::RPAREN); return yy::parser::token::RPAREN; }
^
token.l:18:3: error: use of undeclared identifier 'yylval'
{ yylval->emplace<int>(yy::parser::token::SEMICOLON); return yy::parser::token::SEMICOLON; }
^
token.l:19:3: error: use of undeclared identifier 'yylval'
{ yylval->emplace<int>(yy::parser::token::EQUAL); return yy::parser::token::EQUAL; }
^
token.l:21:3: error: use of undeclared identifier 'yylval'
{ yylval->emplace<std::string>(yytext); return yy::parser::token::ID; }
^
token.l:21:32: error: use of undeclared identifier 'yytext'
{ yylval->emplace<std::string>(yytext); return yy::parser::token::ID; }
^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
make: *** [token.yy.o] Error 1
Could you please help me solve these issues?
Well, I read the Bison manual again and solved the issue myself...
In the Bison C++ example, we can see that the yylex declaration is redefined:
// Give Flex the prototype of yylex we want ...
# define YY_DECL \
  yy::parser::symbol_type yylex (driver& drv)
// ... and declare it for the parser's sake.
YY_DECL;
That's why we can write something like the following in a flex rule:
return yy::parser::make_MINUS (loc);
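Here is a minimal, self-contained sketch of that mechanism, using stand-in types instead of the real generated ones. demo::symbol_type, the demo::make_XXX helpers, the hand-written scanner body and demo_run are all assumptions for illustration, not Bison or flex output. The point is that one YY_DECL macro supplies both the declaration the parser sees and the signature of the scanner definition, so the scanner returns a complete symbol and never needs a yylval parameter.

```cpp
#include <cassert>

using yyscan_t = void*;  // opaque handle type of a reentrant flex scanner

namespace demo {

// Hypothetical stand-ins for yy::parser::symbol_type and the make_XXX helpers.
enum token_kind { END = 0, MINUS = 1, NUM = 2 };
struct symbol_type {
  token_kind kind;
  int value;
};
inline symbol_type make_END() { return {END, 0}; }
inline symbol_type make_MINUS() { return {MINUS, 0}; }
inline symbol_type make_NUM(int v) { return {NUM, v}; }

}  // namespace demo

// Give the scanner the prototype the parser expects ...
#define YY_DECL demo::symbol_type yylex(yyscan_t yyscanner)
// ... and declare it for the parser's sake.
YY_DECL;

// flex would paste its generated scanner body after YY_DECL. This hand-written
// body treats the handle as a pointer to a cursor into a C string.
YY_DECL {
  const char*& cursor = *static_cast<const char**>(yyscanner);
  while (*cursor == ' ') ++cursor;  // skip blanks
  if (*cursor == '-') {
    ++cursor;
    return demo::make_MINUS();
  }
  if (*cursor >= '0' && *cursor <= '9') {
    int value = 0;
    while (*cursor >= '0' && *cursor <= '9') value = value * 10 + (*cursor++ - '0');
    return demo::make_NUM(value);
  }
  return demo::make_END();  // end of string (or anything else) ends the toy input
}

// Usage: scan "12 - 3" token by token.
inline void demo_run() {
  const char* cursor = "12 - 3";
  demo::symbol_type first = yylex(&cursor);
  assert(first.kind == demo::NUM && first.value == 12);
  assert(yylex(&cursor).kind == demo::MINUS);
  assert(yylex(&cursor).value == 3);
  assert(yylex(&cursor).kind == demo::END);
}
```

For the grammar in the question, which declares %param {yyscan_t yyscanner} instead of a driver, I would expect the macro to read yy::parser::symbol_type yylex (yyscan_t yyscanner), with each rule returning a make_XXX value instead of writing to yylval. Treat that as an assumption to check against your own generated parser.tab.hh.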