Reputation: 2239
I have a file format that looks something like this:
...
{string_length} {binary_string}
...
example:
...
10 abcdefghij
...
Is it possible to parse this with lex/yacc? The string has no null terminator, so I'm at a loss as to how to tokenize it.
I'm currently using PLY's lexer and yacc for this.
Upvotes: 0
Views: 239
Reputation: 241921
You can't do it with a regular expression, but you can certainly extract the lexeme. You don't say how the length is terminated; here, I'm assuming that it is terminated by a single space character. I'm also assuming that yylval has some appropriate struct type:
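[[:digit:]]+" " { /* yytext holds the decimal length plus the trailing space */
                  unsigned long len = strtoul(yytext, NULL, 10);
                  yylval.str = malloc(len);  /* not NUL-terminated; check for NULL in real code */
                  yylval.len = len;
                  /* Read exactly len raw bytes straight from the input stream */
                  for (char *p = yylval.str; len; --len, ++p) {
                    int ch = input();
                    if (ch == EOF) { /* handle the lexical error */ }
                    *p = ch;
                  }
                  return BINARY_STRING;
                }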
There are other solutions (a start condition and a state variable for the count, for example), but I think the above is the simplest.
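Since the question mentions PLY, here is a rough, dependency-free Python sketch of the same idea: match the length prefix with a regular expression, then take exactly that many bytes by hand. The names `LENGTH_PREFIX` and `read_binary_string` are made up for illustration, and the single-space terminator is the same assumption as above. As far as I know, the analogous move inside a PLY token function would be to slice `t.lexer.lexdata` and advance `t.lexer.lexpos`, but check that against the PLY documentation before relying on it.

```python
import re

# Assumption: the length is decimal digits terminated by exactly one space.
LENGTH_PREFIX = re.compile(rb"(\d+) ")


def read_binary_string(data, pos):
    """Return (payload, new_pos) for a length-prefixed string starting at pos.

    The regex only recognizes the prefix; the payload is taken by slicing,
    because its extent depends on the number that was just read.
    """
    m = LENGTH_PREFIX.match(data, pos)
    if m is None:
        raise ValueError(f"expected a length prefix at offset {pos}")
    length = int(m.group(1))
    start = m.end()          # first byte after the terminating space
    end = start + length
    if end > len(data):
        # Corresponds to hitting EOF in the middle of the string.
        raise ValueError(f"string at offset {pos} runs past end of input")
    return data[start:end], end


if __name__ == "__main__":
    payload, next_pos = read_binary_string(b"10 abcdefghij", 0)
    print(payload, next_pos)
```

The returned position is where ordinary tokenizing would resume, which plays the same role as the scanner's input position after the `input()` loop.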
Upvotes: 1