Reputation: 75
I have a problem understanding flex's yyunput behavior.
I want to put back some characters.
For example, my scanner found CALL{space}{cc}:
cc N?Z|N?C|P[OE]?|M
%%
CALL{blank}{cc} {BEGIN CON; return yy::ez80asm_parser::make_CALL(loc);}
CALL{mmode}{blank}{cc} {BEGIN CON; return yy::ez80asm_parser::make_CALL(loc);}
CALL {BEGIN ARG; return yy::ez80asm_parser::make_CALL(loc);}
and I want to give back the {cc} so that it will be scanned next time.
What do the two arguments of yyunput have to be? I couldn't find any helpful information about that function.
Any hints are welcome. Jürgen
Upvotes: 1
Views: 3178
Reputation: 241771
You can't "give back the {cc}" because the regular expression doesn't have pieces. (Flex does not do captures, either, so it wouldn't help to put parentheses around it.)
If you just want to rescan part of a token, it is much better to use yyless than unput, since yyless mostly just changes a pointer. With a single call to yyless you can return as many characters as you like, so you only need to know how many characters to return. (More precisely, you tell it how many characters you want to keep in yytext; the remainder are returned and yytext is truncated accordingly.)
For reference, unput is a macro whose single argument is a single character which will be pushed onto the beginning of the unconsumed input, overwriting yytext as it goes. (In the C++ API, it calls the internal member function ::yyunput, supplying it an additional necessary argument. Don't call this function directly.)
If you need to push several characters onto the input, you need to unput them one at a time, starting with the last one. Since unput destroys the value of yytext, you need to make sure that you've already copied it if you need it before calling unput.
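To make the difference between the two calls concrete, here is a toy model in plain C++. It is not flex's implementation: ToyScanner, less and unput are hypothetical names that only mimic the visible effect on the token text and the pending input, and the model ignores the fact that the real unput clobbers yytext.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy model only: NOT flex's implementation, just its visible effect.
struct ToyScanner {
    std::string text;     // stands in for yytext
    std::string pending;  // stands in for the unconsumed input

    // Like yyless(n): keep the first n characters of the token and
    // return the remainder to the front of the input.
    void less(std::size_t n) {
        assert(n <= text.size());
        pending.insert(0, text, n, std::string::npos);
        text.erase(n);
    }

    // Like unput(c): push a single character onto the front of the input.
    // (The real unput may also overwrite yytext; this model does not.)
    void unput(char c) { pending.insert(pending.begin(), c); }
};
```

In this model, calling less(5) on the token "CALL NZ" leaves "CALL " as the token and puts "NZ" back in front of the input. Getting the same result with unput takes one call per character on a saved copy of the token, last character first.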
In your case, I think neither of these is appropriate. What you probably want to do is to not include the {cc} pattern in the match in the first place, which you can do with flex's trailing context operator /. (That assumes that you don't need to include the characters matched by {cc} in the semantic value you will be returning; in the example provided, yytext does not appear to be part of the semantic value, so the assumption should be safe.) To do so, you might write something like:
CALL{mmode}?{blank}/{cc} {BEGIN CON; return yy::ez80asm_parser::make_CALL(loc);}
CALL {BEGIN ARG; return yy::ez80asm_parser::make_CALL(loc);}
(Note: I combined your first two patterns into a single one since they seem to have the same action, but if you actually need the characters matched by {mmode} you might not want to do that.)
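If you want to check outside of flex which characters such a rule would consume, a rough analogue is a regex lookahead. The sketch below uses std::regex, not flex, and the two mechanisms are not identical. It also assumes {blank} means one or more spaces or tabs, and it leaves out {mmode}, since neither definition is shown in the question. call_prefix_length is a hypothetical helper name.

```cpp
#include <cassert>  // used by the checks in the usage note
#include <cstddef>
#include <regex>
#include <string>

// Returns the length of the text that would be consumed as the CALL token,
// or 0 if the input does not start with CALL, blanks and a condition code.
// Assumption: {blank} is [ \t]+ (its definition is not in the question).
std::size_t call_prefix_length(const std::string& input) {
    // (?=...) is a lookahead: the condition code must follow,
    // but it is not part of the match, much like flex's r/s.
    static const std::regex re(R"(CALL[ \t]+(?=N?Z|N?C|P[OE]?|M))");
    std::smatch m;
    if (std::regex_search(input, m, re,
                          std::regex_constants::match_continuous)) {
        return static_cast<std::size_t>(m.length(0));
    }
    return 0;
}
```

For "CALL NZ,label" this reports 5, so the "NZ" is still available to be scanned as the next token. For "CALL 1234" it reports 0, which corresponds to falling through to the plain CALL rule.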
If that doesn't work, for whatever reason, use yyless. You'll need to know how many characters you want to return to the input, so I imagine you would end up with something like:
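CALL{mmode}?{blank}{cc} { BEGIN CON;
    /* Assume a one-character condition code, then widen if needed. */
    int to_keep = yyleng - 1;
    switch (yytext[to_keep]) {
      case 'C': case 'Z':   /* C, Z, NC or NZ */
        if (yytext[to_keep - 1] == 'N') --to_keep;
        break;
      case 'E': case 'O':   /* PE or PO */
        --to_keep;
        break;
      case 'P': case 'M':   /* single-character codes */
        break;
      default: assert(false); /* internal error */
    }
    /* Keep everything before the condition code; rescan the rest. */
    yyless(to_keep);
    return yy::ez80asm_parser::make_CALL(loc);
}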
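The length computation in that action is easy to get subtly wrong, so it can be worth checking on its own. Here is a minimal standalone sketch of the same logic. chars_to_keep is a hypothetical helper, not part of flex, and the ".LIL" suffix in the usage note is only an assumed example of what {mmode} might match. The sketch treats M as a possible final character, because every alternative of {cc} ends in Z, C, P, O, E or M.

```cpp
#include <cassert>
#include <cstddef>

// Given a token that ends in one of the condition codes
//   N?Z | N?C | P[OE]? | M
// return how many leading characters are NOT part of the condition code,
// which is the value the action would pass to yyless.
std::size_t chars_to_keep(const char* text, std::size_t len) {
    assert(len >= 2);               // at least one character before the code
    std::size_t to_keep = len - 1;  // assume a one-character code
    switch (text[to_keep]) {
        case 'C': case 'Z':         // C, Z, NC or NZ
            if (text[to_keep - 1] == 'N') --to_keep;
            break;
        case 'E': case 'O':         // PE or PO
            --to_keep;
            break;
        case 'P': case 'M':         // single-character codes
            break;
        default:
            assert(false);          // not a condition code
    }
    return to_keep;
}
```

For example, chars_to_keep("CALL NZ", 7) gives 5 and chars_to_keep("CALL.LIL NC", 11) gives 9, so in both cases only the two-character condition code is handed back to the scanner.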
For details on the trailing context operator, see the Flex manual section on patterns (search for the word "trailing"; there is also an important note towards the end) and the first paragraph of the following chapter on matching. yyless and unput are both documented in the chapter on actions, which includes examples of their usage.
Upvotes: 2