Wojciech Danilo

Reputation: 11803

Flex lexer output modification

How can I use the flex lexer in C++ and modify a token's yytext value? Let's say I have a rule like this:

"/*"    {
        int c;  /* int, not char, so the comparison with EOF is reliable */
        while(true)
            {
            c = yyinput();
            if(c == '\n')
                ++mylineno;

            if (c==EOF){
                yyerror( "EOF occurred while processing comment" );
                break;
            }
            else if(c == '*')
                {
                if((c = yyinput()) == '/'){
                    return(tokens::COMMENT);}
                else
                    unput(c);
                }
            }
        }

I want to get the token tokens::COMMENT with the text between /* and */ as its value. (The above solution gives "/*" as the value.)

Additionally, tracking the line number is very important, so I'm looking for a solution that supports it.

EDIT: Of course I can modify the yytext and yyleng values (like yytext+=1; yyleng-=1), but that still does not solve the above problem.

Upvotes: 1

Views: 461

Answers (1)

Josh

Reputation: 1764

I still think start conditions are the right answer.

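%x C_COMMENT
%{
#include <stdlib.h>
#include <string.h>

char *str = NULL;   /* comment text collected so far */

/* Append data to str, allocating it on first use. */
void addToString(const char *data)
{
    size_t oldLen = str ? strlen(str) : 0;
    char *grown = (char *)realloc(str, oldLen + strlen(data) + 1);
    if (!grown)
        return;     /* out of memory: keep what was collected */
    str = grown;
    strcpy(str + oldLen, data);
}
%}
%%
"/*"                       { BEGIN(C_COMMENT); }
<C_COMMENT>([^*\n\r]|(\*+([^*/\n\r])))+    { addToString(yytext); }
<C_COMMENT>"*"             { addToString(yytext); /* lone '*', e.g. right before a newline */ }
<C_COMMENT>[\n\r]          { /* handle tracking, add to string if desired */ }
<C_COMMENT>"*/"            { BEGIN(INITIAL); /* str now holds the comment text */ }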
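
Since the question targets C++, the concatenation step could also be done with std::string instead of strdup. Here is a minimal sketch, assuming a hypothetical CommentBuffer helper (not part of flex) that the <C_COMMENT> rules would call with yytext and yyleng. It returns the number of newlines in each chunk so the caller can add them to mylineno from the question:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of flex: collects the text of one comment.
struct CommentBuffer {
    std::string text;   // comment body gathered so far

    // Append one matched chunk and return how many '\n' it contained,
    // so the caller can keep its own line counter up to date.
    int append(const char* data, std::size_t len) {
        text.append(data, len);
        return static_cast<int>(std::count(data, data + len, '\n'));
    }

    // Hand back the collected body and leave the buffer empty
    // for the next comment.
    std::string take() {
        std::string out;
        out.swap(text);
        return out;
    }
};
```

In the rules this might be used as `mylineno += buf.append(yytext, yyleng);`, with `buf.take()` called in the "*/" rule just before returning tokens::COMMENT.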

I used the following as references:
http://ostermiller.org/findcomment.html
https://stackoverflow.com/a/2130124/1003855

You should be able to use a similar regular expression to handle strings.
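
Assuming "strings" here means double-quoted string literals, one common pattern is an opening quote, then any mix of ordinary characters and backslash escapes, then a closing quote. The sketch below checks that pattern outside flex with std::regex; the function name isQuotedString is made up for illustration, and the flex spelling of the pattern (probably \"([^"\\\n]|\\.)*\") should be verified against your flex version:

```cpp
#include <regex>
#include <string>

// Hypothetical check of a double-quoted string pattern:
//   "            opening quote
//   [^"\\\n]     any character except quote, backslash, or newline
//   \\.          or a backslash followed by any one character
//   "            closing quote
inline bool isQuotedString(const std::string& s) {
    static const std::regex pattern(R"re("([^"\\\n]|\\.)*")re");
    return std::regex_match(s, pattern);
}
```

As with comments, a start condition is an alternative if the token value should have the escapes already processed.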

Upvotes: 1
