Xzhsh

Reputation: 2239

Is it possible to parse a string of fixed length in yacc/lex?

I have a file format something like this

...
{string_length} {binary_string}
...

example:

...
10 abcdefghij
...

Is it possible to parse this using lex/yacc? There is no null terminator for the string, so I'm at a loss as to how to tokenize it.

I'm currently using PLY's lex and yacc for this.

Upvotes: 0

Views: 239

Answers (1)

rici

Reputation: 241921

You can't do it with a regular expression alone, because the length of the string depends on a number read earlier in the input, but you can certainly extract the lexeme in the rule's action. You don't say how the length is terminated; here, I'm assuming that it is terminated by a single space character. I'm also assuming that yylval is a struct with str and len members:

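[[:digit:]]+" "  { /* yytext holds the decimal length plus the trailing space.
                      strtoul and malloc need <stdlib.h> in the definitions section. */
                   unsigned long len = strtoul(yytext, NULL, 10);
                   yylval.str = malloc(len ? len : 1);
                   if (!yylval.str) { /* handle allocation failure before reading */ }
                   yylval.len = len;
                   /* Read exactly len raw bytes; the string has no terminator. */
                   for (char *p = yylval.str; len; --len, ++p) {
                     int ch = input();
                     if (ch == EOF) { /* handle the lexical error */ break; }
                     *p = ch;
                   }
                   return BINARY_STRING;
                 }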
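
Since you mention PLY, here is a rough Python sketch of the same idea, written as a plain function rather than as a PLY rule so that it runs on its own. The names LENGTH_PREFIX and read_binary_string are made up for illustration, and the single-space separator is the same assumption as above. In PLY the equivalent move would probably be to match the length in a token function and then advance t.lexer.lexpos past the payload, slicing it out of t.lexer.lexdata.

```python
import re

# Assumption: a decimal length followed by exactly one space.
LENGTH_PREFIX = re.compile(r"(\d+) ")


def read_binary_string(data, pos):
    """Return (payload, new_pos) for a length-prefixed string at data[pos:]."""
    m = LENGTH_PREFIX.match(data, pos)
    if m is None:
        raise ValueError(f"expected a length prefix at offset {pos}")
    length = int(m.group(1))
    start = m.end()          # first character after the separator
    end = start + length     # one past the last payload character
    if end > len(data):
        raise ValueError("input ended inside a binary string")
    # The payload is taken by position, so it may contain spaces or newlines.
    return data[start:end], end


payload, next_pos = read_binary_string("10 abcdefghij rest", 0)
```

The function returns the new position so that the caller can resume ordinary tokenizing right after the payload.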

There are other solutions (a start condition and a state variable for the count, for example), but I think the above is the simplest.
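
For comparison, the start-condition approach might look roughly like this in Python, consuming one character at a time. This is only a sketch: the state variable stands in for a flex start condition, remaining is the count, the tokenize name is made up, and I'm assuming records are separated by newlines, which the question doesn't actually say.

```python
def tokenize(data):
    """Yield ("BINARY_STRING", payload) for each length-prefixed record."""
    state = "LENGTH"   # plays the role of the start condition
    remaining = 0      # state variable: payload characters still to read
    digits = []
    payload = []
    for ch in data:
        if state == "LENGTH":
            if ch in "0123456789":
                digits.append(ch)
            elif ch == " " and digits:
                remaining = int("".join(digits))
                digits = []
                if remaining == 0:
                    yield ("BINARY_STRING", "")
                else:
                    state = "PAYLOAD"
            elif ch == "\n" and not digits:
                continue   # assumed record separator
            else:
                raise ValueError(f"unexpected character {ch!r} in length")
        else:
            # In the PAYLOAD state every character is data, whatever it is.
            payload.append(ch)
            remaining -= 1
            if remaining == 0:
                yield ("BINARY_STRING", "".join(payload))
                payload = []
                state = "LENGTH"
    if state == "PAYLOAD" or digits:
        raise ValueError("input ended inside a record")


tokens = list(tokenize("10 abcdefghij\n3 x y\n"))
```

It needs more bookkeeping than reading the payload in one go, which is why I'd still reach for the first version.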

Upvotes: 1
