mentics

Reputation: 6999

Handling iteration count fields in a parser generator

With a typical parser generator (e.g. ANTLR or Beaver), how might one handle input like the following:

0051A2B3C4D5E
0031G2T3H

Each line starts with a 3-character numeric field that tells you how many iterations of a repeating field come after it.

I know post-processing is possible, but it won't be useful in some cases, so I'm trying to find out whether the parser itself can handle this. A solution that involves interacting with the parser when it reads the numeric field would be acceptable, for example by somehow telling it to read the next N items based on a certain production.

Upvotes: 1

Views: 78

Answers (1)

Bart Kiers

Reputation: 170188

Whether this is possible depends on the parser generator.

Your lexer will need to be aware of its surroundings (context-sensitive): you only want to create a Num token at the start of a line. In ANTLR, you can do that by adding the predicate getCharPositionInLine()==0 in front of the Num rule.

Then, in your parser rule, line, you need to keep consuming Block tokens (your two-character fields) as long as the counter is greater than zero (the counter starting at the value of Num).

A quick ANTLR demo:

// ANTLR 3 syntax: the gated predicates {...}?=> used below are not available in ANTLR 4
grammar T;

parse
 : line* EOF 
 ;

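// n counts the Block tokens still expected on the current line;
// the gated predicate {n > 0}?=> lets the loop match a Block only while n > 0
line
@init{int n = 0;}
 : Num {n = Integer.parseInt($Num.text);} ({n > 0}?=> Block {n--;})*
 ;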

// only create a Num token when the lexer is at the start of a line
Num
 : {getCharPositionInLine()==0}?=> Digit Digit Digit
 ;

Block
 : AlphaNum AlphaNum
 ;

Space
 : (' ' | '\t' | '\r' | '\n')+ {skip();}
 ;

fragment Digit : '0'..'9';
fragment Letter : 'a'..'z' | 'A'..'Z';
fragment AlphaNum : Letter | Digit;

would parse your input:

0051A2B3C4D5E
0031G2T3H

as follows:

[image: parse tree for the input above]
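For comparison, the same counting logic can be sketched by hand in plain Java, outside of any parser generator. This is only an illustration: the class CountedFields and its method parseLine are hypothetical names, not part of ANTLR. The field widths (3 digits for the count, 2 alphanumeric characters per block) and the choice to reject trailing input are assumptions taken from the example lines.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not an ANTLR API): parses one record such as "0051A2B3C4D5E".
final class CountedFields {
    private static final int COUNT_WIDTH = 3; // assumed width of the numeric count field
    private static final int BLOCK_WIDTH = 2; // assumed width of each repeating field

    static List<String> parseLine(String line) {
        if (line.length() < COUNT_WIDTH) {
            throw new IllegalArgumentException("missing count field: " + line);
        }
        // Read the count field, accepting ASCII digits only (like the Digit fragment).
        int n = 0;
        for (int i = 0; i < COUNT_WIDTH; i++) {
            char c = line.charAt(i);
            if (c < '0' || c > '9') {
                throw new IllegalArgumentException("count field is not numeric: " + line);
            }
            n = n * 10 + (c - '0');
        }
        // Consume exactly n blocks, mirroring ({n > 0}?=> Block {n--;})* in the grammar.
        List<String> blocks = new ArrayList<>();
        int pos = COUNT_WIDTH;
        while (n > 0) {
            if (pos + BLOCK_WIDTH > line.length()) {
                throw new IllegalArgumentException("fewer blocks than announced: " + line);
            }
            String block = line.substring(pos, pos + BLOCK_WIDTH);
            for (int i = 0; i < block.length(); i++) {
                if (!isAsciiAlphaNum(block.charAt(i))) {
                    throw new IllegalArgumentException("invalid block '" + block + "' in: " + line);
                }
            }
            blocks.add(block);
            pos += BLOCK_WIDTH;
            n--;
        }
        // Assumption: anything left after the announced blocks is an error.
        if (pos != line.length()) {
            throw new IllegalArgumentException("unexpected trailing input: " + line);
        }
        return blocks;
    }

    // Same character set as the AlphaNum fragment in the grammar.
    private static boolean isAsciiAlphaNum(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }
}
```

Calling CountedFields.parseLine("0031G2T3H") returns the three blocks 1G, 2T and 3H. A line that announces more blocks than it contains, such as "0021A", is rejected with an IllegalArgumentException.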

Upvotes: 2
