Reputation: 31
How can I modify my lex or yacc files so that they copy the input to an output file? I read statements from a file, and for certain statements I want to insert an invariant into the output and then continue with the remaining statements. For example, given this input:
char mem(d);
int fun(a,b);
char a ;
The output should look like this:
char mem(d);
int fun(a,b);
invariant(a>b) ;
char a;
I haven't been able to do this; I can only write the new statements to the output file.
Upvotes: 1
Views: 1078
Reputation: 50190
Since you can already output your own statements, your problem is how to write out the input as it is being read. In lex, the text of each token being read is available in the variable yytext, so just write it out for every token you read. Depending on how your lexer is written, this can be used to echo whitespace as well.
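A minimal flex sketch of that idea (the token name IDENT is illustrative; ECHO is flex's built-in action that writes yytext to yyout):

```
%%
[a-zA-Z_][a-zA-Z0-9_]*   { ECHO; return IDENT; }  /* echo the token, then hand it to the parser */
[ \t\n]+                 { ECHO; }                /* pass whitespace straight through */
.                        { ECHO; }                /* and anything else */
%%
```

With a catch-all rule like the last two above, every input character reaches the output even though only the tokens reach the parser.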
Upvotes: 0
Reputation: 241671
It's useful to understand why this is a non-trivial question.
The goal is to:

1. copy the entire input to the output; and
2. insert some extra information produced while parsing.
The problem is that the first of those needs to be done by the scanner (lexer), but the scanner doesn't usually pass every character through to the parser. It usually drops at least whitespace and comments, and it may do other things, like convert numbers to their binary representation, losing the original textual representation.
But the second one obviously needs to be done by the parser. And here is the problem: the parser is (almost) always one token behind the scanner, because it needs the lookahead token to decide whether or not to reduce. Consequently, by the time a reduction action gets executed, the scanner will already have processed all the input up to the end of the next token. If the scanner is echoing input to output, the place where the parser wants to insert data has already been written out.
Two approaches suggest themselves.
First, the scanner could pass all of the input to the parser, by attaching extra data to every token. (For example, it could attach all whitespace and comments to the following token.) That's often used for syntax coloring and reformatting applications, but it can be awkward to get the tokens output in the right order, since reduction actions are effectively executed in a post-order walk.
Second, the scanner could just remember where every token is in the input file, and the parser could attach notes (such as additional output) to token locations. Then the input file could be read again and merged with the notes. Unfortunately, that requires the input to be rewindable, which would preclude parsing from a pipe, for example; a more general solution is to copy the input into a temporary file, or even just keep it in memory if you don't expect it to be too large.
Upvotes: 1