Reputation: 145
I am trying to write a simple parser using Lex and Yacc, and I have not used either tool before. After finishing the lex and yacc files, I get errors when I compile them. I think the errors are related to the string header files not being included properly, but I couldn't figure it out myself.
The Lex file named "tokens.l":
%{
#include "parser.hpp"
%}
MODEL "model"
PORT "input"|"output"|"intern"
GATE "xor"|"and"|"or"|"buf"|"cmos1"|"dff"|"dlat"|"inv"|"mux"|"nand"|"nor"|"tie0"|"tie1"|"tiex"|"tiez"|"tsh"|"tsl"|"tsli"|"xnor"
INSTNAME [A-Z0-9]+
PRIMITIVE "primitive"
LEFT "("
RIGHT ")"
COMMA ","
SEMICOLON ";"
EQUAL "="
BLANK [ \t\n]+
%%
{MODEL} {return MODEL;}
{PORT} { if (yytext == "input")
return INPUT;
else if (yytext == "output")
return OUTPUT;
else
return INTERN;
}
_{GATE} {return GATE;}
{INSTNAME} {return INSTNAME;}
{PRIMITIVE} {return PRIMITIVE;}
{LEFT} {return LEFT;}
{RIGHT} {return RIGHT;}
{COMMA} {return COMMA;}
{SEMICOLON} {return SEMICOLON;}
{EQUAL} {return EQUAL;}
{BLANK} {;}
"\0" {return END;}
%%
The yacc file named "parser.y":
%{
#include <iostream>
#include <string>
#include <cstdio>
extern FILE *fp;
%}
%union{
std::string* str;
}
%token <str> MODEL
%token <str> INPUT
%token <str> OUTPUT
%token <str> INTERN
%token <str> GATE
%token <str> INSTNAME
%token PRIMITIVE
%token LEFT
%token RIGHT
%token COMMA
%token SEMICOLON
%token EQUAL
%token END
%type <str> vfile modules module params param interngates interngate primitives
%%
vfile : modules END {
std::ofstream fp;
fp.open("output.v");
fp<<$1;
fp.close();
$$ = new std::string("success");
std::cout<<$$;
}
modules : modules module {$$=$1+$2;}
| module {$$=$1;}
module :MODEL INSTNAME LEFT params RIGHT LEFT interngates RIGHT
{$$ = "module "+$2+" ("+$4+");\n"+$7+"endmodule\n";}
interngates :interngates interngate {$$=$1+$2+"\n";}
|interngate {$$=$1+"\n";}
interngate :INPUT LEFT params RIGHT primitives {$$=$1+$3+"\n"+$5;}
| OUTPUT LEFT params RIGHT primitives { $$=$1+$3+"\n"+$5;}
| INTERN LEFT params RIGHT primitives {$$="wire"+$3+"\n"+$5;}
primitives :LEFT RIGHT {$$="";}
|LEFT PRIMITIVE EQUAL GATE INSTNAME params SEMICOLON RIGHT
{$$=$4+" "+$5+" ("+$6+");\n";}
params :params COMMA param {$$=$1+","+$3;}
| param {$$=$1;}
param :INSTNAME {$$=$1;}
%%
To compile the files, I use the commands below:
bison -d -o parser.cpp parser.y
lex -o tokens.cpp tokens.l
g++ -o myparser tokens.cpp parser.cpp -lfl
Can anybody give me a clue? Thanks a lot!
Updated: Error report on osx. http://www.edaplayground.com/x/3HL
Upvotes: 1
Views: 2478
Reputation: 20842
You can't store a C++ std::string (or any other class with a non-trivial constructor) by value in %union. You'll need to store a pointer to a dynamically allocated (heap) string instead.
Instead of
%union {
string str;
}
Try:
%union {
std::string *str;
}
You will need to change every use of yylval.str (yylval->str in a reentrant parser) and of $$, $1, etc. whose %type or %token is <str>, so that it works with dynamically allocated strings.
So instead of
$$ = "success";
You have to do:
$$ = new std::string("success");
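The lexer side needs the same treatment: a rule that returns a <str> token would have to put a heap-allocated string into yylval.str. Note too that yytext is a C string, so a test like yytext == "input" compares addresses rather than characters. Below is a minimal sketch of what a {PORT} action could do, written as plain C++ so it can be tried in isolation. The Token enum, the SemanticValue union, and classify_port are hypothetical stand-ins, not what Bison or Flex actually generate:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-ins for the token codes Bison would put in parser.hpp.
enum Token { INPUT = 258, OUTPUT, INTERN };

// Stand-in for the %union from the question: it holds only a pointer.
union SemanticValue {
    std::string* str;
};

// Classifies a port keyword by its characters (not by pointer identity)
// and stores a heap-allocated copy of the lexeme for the parser.
Token classify_port(const char* text, SemanticValue& value) {
    assert(text != nullptr);
    value.str = new std::string(text);  // whoever receives the value must delete it
    if (std::strcmp(text, "input") == 0) {
        return INPUT;
    }
    if (std::strcmp(text, "output") == 0) {
        return OUTPUT;
    }
    return INTERN;
}
```

In a real tokens.l the body of such a function would sit inside the rule's action, with yytext as the argument and yylval as the destination.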
It is customary to use pointers in a yacc/bison YYSTYPE %union anyway, to avoid a huge amount of copying on the stack. Keep in mind that your productions should free the strings of tokens or non-terminals that are no longer used. If your parser is short-lived and the source files aren't huge, you can cheat and skip freeing them, or use garbage collection.
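One way to keep that bookkeeping in one place is a small helper that the actions call instead of operator+. The sketch below assumes every <str> value is a distinct heap string owned by the parser. concat_and_free is a hypothetical name, not part of Bison:

```cpp
#include <cassert>
#include <memory>
#include <string>

// Appends *rhs to *lhs, deletes rhs, and returns lhs.
// The caller owns the returned string and must not touch rhs afterwards.
std::string* concat_and_free(std::string* lhs, std::string* rhs) {
    assert(lhs != nullptr && rhs != nullptr);
    assert(lhs != rhs);  // two distinct allocations are required
    std::unique_ptr<std::string> right(rhs);  // released when this function returns
    *lhs += *right;
    return lhs;
}
```

With such a helper, an action like $$=$1+$2; (which tries to add two pointers and does not compile) could become $$ = concat_and_free($1, $2);. Reusing the left operand avoids one allocation per reduction.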
It is possible to redefine YYSTYPE to a regular string (non-pointer), but you lose the ability to use the union, which most non-trivial parsers need to pass up a mix of tokens or typed AST objects in semantic actions. Constraining your productions to a single type is less useful than void *.
It is also possible to redefine YYSTYPE to use a variant / polymorphic type, or to use a multi-member struct (a poor substitute for a variant). The former defeats the purpose of the "type safe" %type and %token declarations, and the latter forces you to remember the type of each terminal or non-terminal and to name the struct member explicitly ($$->str = "foo", $$->expr.left = $1->str, etc.). This is the downside of using a C-based parser with C++. You may want to try Bison's C++ parser skeleton; I have little experience with it because I hit compile errors every time I tried it over the years.
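To make the variant idea concrete, here is a rough sketch using C++17 std::variant in place of a union. It is not wired into Bison; SemanticValue and describe are hypothetical names, and the choice of alternatives (a name or a count) is only for illustration:

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical semantic value: either an identifier or an integer count.
// Unlike a plain union, the variant constructs and destroys std::string correctly.
using SemanticValue = std::variant<std::string, int>;

// Renders whichever alternative is currently stored.
std::string describe(const SemanticValue& value) {
    if (const std::string* name = std::get_if<std::string>(&value)) {
        return "name:" + *name;
    }
    return "count:" + std::to_string(std::get<int>(value));
}
```

The type check happens at run time here (std::get_if returns a null pointer on a mismatch), which is the loss of compile-time checking described above.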
There are other (better) workarounds that I have found. I have seen Bison patched to allow boost::variant for YYSTYPE with support for %type and %token. Google "bison Michiel de Wilde" or "bison variant YYSTYPE" (http://lists.gnu.org/archive/html/bison-patches/2007-06/msg00000.html). However, like many Bison suggestions over the years, the patches were met with some vague arguments or general discussion about alternatives, and then it fizzled.
Upvotes: 3