dylhunn

Reputation: 1432

Parsing with Bison: constructor in action

I am trying to construct a parser with Bison. I have the following in the first section:

%union {
    int ttype;
    // enums used in lexer
    Staff stafftype;
    Numeral numeral;
    Quality quality;
    Inversion inversion;
    Pitch pitch;
    Accidental accidental;
    // Classes used in parser
    Roman roman;
}

%token <stafftype> STAFFTYPE
%token <numeral> NUMERAL
%token <quality> QUALITY
%token <inversion> INVERSION
%token <pitch> PITCH
%token <accidental> ACCIDENTAL
%token <ttype> COLON
%token <ttype> SLASH
%token <ttype> COMMA

%type <roman> accidentalRoman

I also have some grammar rules. Here is one:

accidentalRoman
    : NUMERAL { $$ = Roman($1); }
    | ACCIDENTAL NUMERAL { $$ = Roman($2, $1); }
    ;

I basically have three related questions.

  1. What does the %union really represent? I thought it represented the types the lexer could return. My lexer rules contain statements like return STAFFTYPE, to indicate that I have populated yylval.stafftype with a Staff object. Fair enough.
  2. However, the union also seems to have something to do with the $$ = statements in the grammar actions. Why do the result types of grammar actions need to be in the union?
  3. In my example, the Roman class has a constructor with parameters. However, declaring it in the union causes the error no matching function for call to 'Roman::Roman()'. Is there any way around this? I'm trying to build up a parse tree with $$ =, and the nodes in the tree definitely need parameters in their constructors. In fact, it doesn't even allow a 0-parameter constructor: error: union member 'YYSTYPE::roman' with non-trivial 'Roman::Roman()'.

Upvotes: 0

Views: 537

Answers (1)

user207421

Reputation: 311039

  1. What does the %union really represent? I thought it represented types the lexer could return.

No. It represents the types of semantic values, most importantly what productions can return via $$ =. The lexer's return value is just one of the integer constants defined via %token directives. The lexer can populate a yylval member as a side effect, but that isn't a return type of the lexer in any sense.

My lexer rules contain statements like return STAFFTYPE, to indicate that I have populated yylval.stafftype with a Staff object.

They shouldn't. They should return token types as used in the grammar, and they usually shouldn't put anything into yylval except in the case of literals. Otherwise you're doing work in the lexer that the parser should do.
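A minimal sketch of that split, written as plain C++ rather than as Flex/Bison input. Everything here is a hypothetical stand-in: TokenCode imitates the constants Bison would generate from %token, SemanticValue imitates the %union, current_value plays the role of yylval, and lex_one and numeral_of are invented helper names.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for what Bison would generate from %token and %union.
enum TokenCode { END_OF_INPUT = 0, NUMERAL = 258, ACCIDENTAL = 259 };
enum Numeral { I = 1, II, III, IV, V, VI, VII };
enum Accidental { FLAT, SHARP };

union SemanticValue {
    Numeral numeral;
    Accidental accidental;
};

// Plays the role of yylval: written by the lexer, read by the parser.
SemanticValue current_value;

// Toy lexer for a single lexeme. The return value is only the token code;
// the semantic value travels separately, through current_value.
int lex_one(const std::string& lexeme) {
    if (lexeme == "b")  { current_value.accidental = FLAT;  return ACCIDENTAL; }
    if (lexeme == "#")  { current_value.accidental = SHARP; return ACCIDENTAL; }
    if (lexeme == "IV") { current_value.numeral = IV;       return NUMERAL; }
    if (lexeme == "V")  { current_value.numeral = V;        return NUMERAL; }
    return END_OF_INPUT;  // unknown input ends the toy token stream
}

// The token code is what tells the reader which union member is valid.
Numeral numeral_of(const std::string& lexeme) {
    int code = lex_one(lexeme);
    assert(code == NUMERAL);
    (void)code;  // keeps builds with NDEBUG free of unused-variable warnings
    return current_value.numeral;
}
```

The point of the sketch is only that the int returned by the lexer and the value left in the union are two separate channels.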

  2. The union also seems to have something to do with the $$ = statements in the grammar actions. Why do the result types of grammar actions need to be in the union?

Because that's where they are placed: on top of the parser's value stack, whose slots all have the union type, alongside the yylval values of the tokens.
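To make the stack picture concrete, here is a rough C++ imitation of what happens when the rule accidentalRoman : ACCIDENTAL NUMERAL is reduced. It is not Bison's generated code. RomanData is a hypothetical trivial aggregate (not the Roman class from the question), chosen so that it can sit directly in a union, and reduce_accidental_roman is an invented name.

```cpp
#include <cassert>
#include <vector>

enum Numeral { I = 1, II, III, IV, V, VI, VII };
enum Accidental { NATURAL, FLAT, SHARP };

// Hypothetical trivial aggregate, so it may live directly inside a union.
struct RomanData {
    Numeral numeral;
    Accidental accidental;
};

// Every slot of the value stack has this one type,
// for token values and for rule results alike.
union SemanticValue {
    Numeral numeral;
    Accidental accidental;
    RomanData roman;
};

// Imitates reducing `accidentalRoman : ACCIDENTAL NUMERAL`:
// $1 and $2 are the top two slots, and $$ replaces them on the same stack.
void reduce_accidental_roman(std::vector<SemanticValue>& stack) {
    assert(stack.size() >= 2);
    SemanticValue numeral_slot = stack.back();     // $2
    stack.pop_back();
    SemanticValue accidental_slot = stack.back();  // $1
    stack.pop_back();

    SemanticValue result;                          // $$
    result.roman = RomanData{numeral_slot.numeral, accidental_slot.accidental};
    stack.push_back(result);
}
```

Because the result goes back onto the same stack that held the token values, its type has to be one of the union's members.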

  3. In my example, the Roman class has a constructor with parameters. However, declaration in the union causes the error no matching function for call to 'Roman::Roman()'. Is there any way around this? I'm trying to build up a parse tree with $$ =, and the nodes in the tree definitely need parameters in their constructors. In fact, it doesn't even allow a 0-parameter constructor: error: union member 'YYSTYPE::roman' with non-trivial 'Roman::Roman()'.

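In general the %union should consist of ints, doubles, other primitive types, and pointers. Objects in unions are problematic anyway, and on a parser stack they are mostly a massive waste of space. So declare the member as Roman* roman; and allocate the node in the action, e.g. $$ = new Roman($1);.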
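As a sketch of the pointer approach, the C++ below mirrors what a %union with a Roman* member and an action like $$ = new Roman($2, $1); would amount to. The Roman class here is a guess at the question's class (only the two constructor shapes come from the grammar actions), and make_accidental_roman and roman_of are invented helper names.

```cpp
#include <cassert>

enum Numeral { I = 1, II, III, IV, V, VI, VII };
enum Accidental { NATURAL, FLAT, SHARP };

// Stand-in for the question's Roman class: it has no default constructor at all.
class Roman {
public:
    explicit Roman(Numeral n) : numeral_(n), accidental_(NATURAL) {}
    Roman(Numeral n, Accidental a) : numeral_(n), accidental_(a) {}
    Numeral numeral() const { return numeral_; }
    Accidental accidental() const { return accidental_; }
private:
    Numeral numeral_;
    Accidental accidental_;
};

// Counterpart of a %union with a `Roman* roman;` member. A pointer is a
// trivial type, so the union compiles whatever constructors Roman has.
union SemanticValue {
    Numeral numeral;
    Accidental accidental;
    Roman* roman;
};

// Counterpart of the action `{ $$ = new Roman($2, $1); }`.
SemanticValue make_accidental_roman(SemanticValue accidental, SemanticValue numeral) {
    SemanticValue result;
    result.roman = new Roman(numeral.numeral, accidental.accidental);
    return result;
}

// Reads the node back out of a slot known to hold the roman member.
const Roman& roman_of(const SemanticValue& value) {
    assert(value.roman != nullptr);
    return *value.roman;
}
```

The trade-off is ownership: whoever ends up holding the tree must delete the nodes. Bison's %destructor directive can be used to free values that the parser discards during error recovery.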

Upvotes: 1
