Reputation: 69
I am new to Lex & Yacc programming and just getting started with grammar. I cam across this program on the net and have been trying to understand it. Following are the Lex and Yacc code-snippets for a simple calculator :
Lex rule:
%%
[a-z] {
yylval = *yytext - 'a';
return VARIABLE;
}
[0-9]+ {
yylval = atoi(yytext);
return INTEGER;
}
[-+()=/*\n] { return *yytext; }
[ \t] ;
. yyerror("invalid character");
%%
YACC grammar:
%%
program:
program statement '\n'
|
;
statement:
expr { printf("%d\n", $1); }
| VARIABLE '=' expr { sym[$1] = $3; }
;
expr:
INTEGER
| VARIABLE { $$ = sym[$1]; }
| expr '+' expr { $$ = $1 + $3; }
| expr '-' expr { $$ = $1 - $3; }
| expr '*' expr { $$ = $1 * $3; }
| expr '/' expr { $$ = $1 / $3; }
| '(' expr ')' { $$ = $2; }
;
%%
Can anyone please help me understand step by step how the input expression x = 3 + 4
will be processed/parsed ?
As per my understanding, while processing the input, 'x
' will return VARIABLE whereas 3
, 5
and 4
will be returned as INTEGER by Lex. However, in Yacc, as per the grammar, since a VARIABLE could be deduced as expr, the expression will become: expr = expr '+' expr
So how will this be reduced to get VARIABLE '=' expr { sym[$1] = $3; }
?
Any help is appreciated.
Thanks
Upvotes: 1
Views: 2823
Reputation: 51
Here, i tried to give line by line explanation of your lex rule section. In next post, i will try to give the explanation of your YACC grammar section.
//----------LEX Rule Section---------- pattern <-> {co-responding action}
// yytext :
// What actualy yytext does is: it acts like a pointer which points the location of current charchter reading on
// as i said, this yytext is pointer for lex/flex file . So ,to access the value of yytext . We need * before it.
// yylaval:
// it pass the semantic value of a token from the lexer(lex/flex file or .l file) to the parser(yacc/bison file or .y file)
// Or you can say this yylval returns value associated token
// yyerror()
// when yyparse() is called it automatically invoke yyerror() when any syntax error arise
%%
[a-z] {/* Consider a case where user is is giving input as 'f-a' or 'b+c' . As this is calculator programme So we are handling this case by converting these input as ascii value */
yylval = *yytext - 'a'; /* Current value of yytext minus ascii value 'a' is stored into yylval */
return VARIABLE; /* here VARIABLE specified non-terminal for .y file */
}
[0-9]+ {
yylval = atoi(yytext); /* atoi() is a lirary function of stdlib.h . it converts the string argument to an integer value*/
return INTEGER; /* here also INTEGER specified non-terminal for .y file */
}
[-+()=*/] { return *yytext; } /* inside [] bracket, elements are considered as charchter set,if any element matched then fire the action section
[ \t] {;} /* if any space or tab encounters then here action section is to do nothing */
. { yyerror("invalid character"); } /* actually . means all charchters except new line */
/* But here we introduced . at the end of rule section,So this . means all other charchters except above specified pattern and except new line */
%%
Upvotes: 0
Reputation: 311039
x
is shifted; =
is shifted; 3,+,4
are shifted; then 3+4
matches the addition production, which is reduced to expr
, which then allows the assignment production to be reduced. You need to remember that yacc is a bottom-up parser.
Upvotes: 2