Reputation: 751
I came across the following problem while trying to write a grammar for a certain assembly language.
The example grammar file looks like this:
grammar test;
stat: operation+;
operation : (add | addi);
add : 'ADD' datatype xd ',' xn;
addi : 'ADD.s64' xd ',' '#' imm;
datatype : '.s64'| '.f32';
xd : 'X0' | 'X1';
xn : 'X0' | 'X1';
imm : '0' | '1' | '2' | '3' | '4';
The grammar should be able to parse two assembly instructions:
ADD: ex. ADD.s64 X1, X2 or ADD.f32 X1, X2
ADD(imm) ex. ADD.s64 X1, # X3
The problem is that ADD(imm) can only have .s64 as its datatype, and I would prefer not to make a separate datatype rule just for ADD(imm).
However, when I enter ADD.s64 X1, X3, the parser always matches addi and reports the error "fail to match the #".
I guess this is because the parser's logic is to find the longest match of the text (which is 'ADD.s64').
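To illustrate that guess, here is a minimal Python sketch of a greedy longest-match tokenizer. It is not ANTLR itself, only a simulation: the LITERALS list is my assumption about the implicit tokens the quoted literals in the grammar would produce, and skipping whitespace is also an assumption, since the grammar above defines no whitespace rule. The function name tokenize is hypothetical.

```python
# Assumed set of implicit tokens, taken from the quoted literals in the grammar.
LITERALS = ['ADD', 'ADD.s64', '.s64', '.f32', 'X0', 'X1', ',', '#',
            '0', '1', '2', '3', '4']


def tokenize(text, literals=LITERALS):
    """Split text into tokens, always taking the longest literal that fits."""
    tokens = []
    pos = 0
    while pos < len(text):
        # Assumption: whitespace between tokens is simply skipped.
        if text[pos].isspace():
            pos += 1
            continue
        # Collect every literal that matches at the current position.
        candidates = [lit for lit in literals if text.startswith(lit, pos)]
        if not candidates:
            raise ValueError(f"no token matches at position {pos}: {text[pos:]!r}")
        # Longest match wins, so 'ADD.s64' beats 'ADD' followed by '.s64'.
        best = max(candidates, key=len)
        tokens.append(best)
        pos += len(best)
    return tokens


print(tokenize("ADD.s64 X1, X0"))  # ['ADD.s64', 'X1', ',', 'X0']
print(tokenize("ADD.f32 X1, X0"))  # ['ADD', '.f32', 'X1', ',', 'X0']
```

If the real lexer behaves like this sketch, the add rule never sees the token sequence 'ADD' '.s64' for an ADD.s64 instruction, which would explain why only addi can be attempted.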
I want to know whether there is a way to do error recovery so that the parser then tries to match the correct add rule.
Upvotes: 1
Views: 93
Reputation: 807
The instruction ADD.s64 X1, X3 cannot be matched, because xn cannot be equal to X3.
Because rule add is not matched, the parser tries to match rule addi, but fails because the character '#' is not found in the instruction.
By the way, the way you wrote your grammar, addi will match a pattern like ADD.s64 X1, # 3 and not ADD.s64 X1, # X3, as wanted.
Upvotes: 1