I have a grammar in ANTLR and I have a file for testing my grammar. But I don't know what is wrong with my output.
This is my grammar:
grammar proj;
start
: (assign|define|read|write|condition|while|module|callingmodule)+
;
assign
: T_ID T_ENTESAB expentesab T_SEPARATOR
;
expentesab
: T_ID
| T_NUMBER
| T_SABETMANTEGHI
| expentesab operator expentesab
| expentesab operator
| operator expentesab
| T_PARANTEZBAZ expentesab T_PARANTEZBASTE
| expentesab T_COMMA expentesab
| T_PARANTEZBAZ expentesab T_COMMA expentesab T_PARANTEZBASTE
| expentesab T_COMMA T_PARANTEZBAZ expentesab T_PARANTEZBASTE
;
operator
: T_ADD
| T_SUB
| T_MUL
| T_DIV
| T_POW
| T_FACT
| T_AND
| T_OR
| T_XOR
;
define
: T_ID T_2POINT T_TYPE T_SEPARATOR
;
read
: T_READ expread T_SEPARATOR
;
expread
: T_ID
| T_NUMBER
| operator
| T_PARANTEZBAZ expread T_PARANTEZBASTE
| expread operator expread
;
write
: T_WRITE expwrite T_SEPARATOR
;
expwrite
: T_ID
| T_NUMBER
| operator
| T_PARANTEZBAZ expwrite T_PARANTEZBASTE
| expwrite operator expwrite
| expwrite T_COMPARE expwrite
;
condition
: T_IF expcon T_THEN code_if T_ELSE code_if T_SEPARATOR
| T_IF expcon T_THEN code_if T_SEPARATOR
;
expcon
: assign
| define
| expcon T_COMPARE expcon
| expcon operator expcon
| operator expcon
;
code_if
: condition
| block
| define
| assign
| callingmodule
| code_if operator code_if
| T_PARANTEZBAZ code_if T_PARANTEZBASTE T_SEPARATOR
| T_PARANTEZBAZ code_if operator code_if T_PARANTEZBASTE T_SEPARATOR
;
callingmodule
: T_ID T_PARANTEZBAZ params T_PARANTEZBASTE T_SEPARATOR
| T_ID T_PARANTEZBAZ T_PARANTEZBASTE T_SEPARATOR
;
params
: expparam(T_COMMA expparam)*
;
expparam
: T_ID
| shart
| T_ID operator T_NUMBER
| T_PARANTEZBAZ expparam T_PARANTEZBASTE
;
while
: T_WHILE expwhile code_while
;
expwhile
: T_SABETMANTEGHI
| T_NUMBER
| T_PARANTEZBAZ expwhile T_PARANTEZBASTE
| expwhile operator expwhile
| T_ID T_COMPARE T_ID
| expwhile T_AND expwhile
| expwhile T_OR expwhile
| expwhile T_XOR expwhile
;
code_while
: block
| module
| callingmodule
| define
| assign
| code_while operator code_while T_SEPARATOR
| T_PARANTEZBAZ code_while T_PARANTEZBASTE T_SEPARATOR
| T_PARANTEZBAZ code_while operator code_while T_PARANTEZBASTE T_SEPARATOR
;
block
: T_BEGIN inner_block T_END
;
inner_block
: define
| assign
| condition
| callingmodule
| block
| T_ID operator T_ID
| T_PARANTEZBAZ inner_block T_PARANTEZBASTE T_SEPARATOR
| T_PARANTEZBAZ T_ID operator T_ID T_PARANTEZBASTE T_SEPARATOR
;
module
: T_MODULE T_ID T_INPUT T_2POINT (define)+ T_OUTPUT T_2POINT T_TYPE block
| T_MODULE T_ID block
;
shart
: expcon T_CONDITION code_if T_2POINT code_if
| expcon T_CONDITION code_if T_2POINT code_if T_SEPARATOR
;
T_TYPE: ('s'|'S')('t'|'T')('r'|'R')('i'|'I')('n'|'N')('g'|'G')|('r'|'R')('e'|'E')('a'|'A')('l'|'L')|
('b'|'B')('o'|'O')('o'|'O')('l'|'L');
T_END: ('e'|'E')('n'|'N')('d'|'D');
T_BEGIN:('b'|'B')('e'|'E')('g'|'G')('i'|'I')('n'|'N');
T_WHILE:('w'|'W')('h'|'H')('i'|'I')('l'|'L')('e'|'E');
T_IF:('i'|'I')('f'|'F');
T_THEN:('t'|'T')('h'|'H')('e'|'E')('n'|'N');
T_ELSE:('e'|'E')('l'|'L')('s'|'S')('e'|'E');
T_READ:('r'|'R')('e'|'E')('a'|'A')('d'|'D');
T_WRITE:('w'|'W')('r'|'R')('i'|'I')('t'|'T')('e'|'E');
T_MODULE:('M'|'m')('O'|'o')('D'|'d')('U'|'u')('L'|'l')('E'|'e');
T_INPUT:('I'|'i')('N'|'n')('P'|'p')('U'|'u')('T'|'t');
T_OUTPUT:('O'|'o')('U'|'u')('T'|'t')('P'|'p')('U'|'u')('T'|'t');
T_RETURN:('R'|'r')('E'|'e')('T'|'t')('U'|'u')('R'|'r')('N'|'n');
T_SEPARATOR : ';';
T_SABETMANTEGHI: ('t'|'T')('r'|'R')('u'|'U')('e'|'E')|('f'|'F')('a'|'A')('l'|'L')('s'|'S')('e'|'E');
T_NUMBER:T_HEXNUMBER|T_INTEGERNUMBER;
T_HEXNUMBER: '0' ('x'|'X') ('0'..'9'|'a'..'f'|'A'..'F')+|'0' ('x'|'X') ('0'..'9'|'a'..'f'|'A'..'F')+ '.' ('0'..'9'|'a'..'f'|'A'..'F')+;
T_INTEGERNUMBER:(('0'..'9')+|('0'..'9')+ '.'('0'..'9')+);
T_FUNC:('F'|'f')('U'|'u')|('N'|'n')('C'|'c');
T_ADD: '+';
T_SUB: '-';
T_MUL: '*';
T_DIV: '/';
T_POW: '^';
T_FACT: '!';
T_ENTESAB:'=';
T_X:'x'|'X';
T_AND: ('a'|'A')('n'|'N')('d'|'D');
T_OR: ('o'|'O')('r'|'R');
T_NOT: ('n'|'N')('o'|'O')('t'|'T');
T_XOR: ('x'|'X')('o'|'O')('r'|'R');
T_COMPARE: '>'| '<'| '>='|'<='| '<>';
T_REMAIN: '%';
T_CONDITION:'?';
T_2POINT:':';
T_PARANTEZBAZ:'(';
T_PARANTEZBASTE:')';
T_COMMA:',';
T_COMMENT:T_COM1LINE|T_COMMULLINE;
T_COM1LINE: '%%' ~( '\t'|'\r')+ -> skip ;
T_COMMULLINE:'%%%' (.|('\t'|'\r'|' '|'\n'))*? '%%%' ->skip;
T_ID : [a-zA-Z] ([a-zA-Z]|('0'..'9'))*;
T_WS : (('\t'|'\r'|' ')+) ->skip;
T_NEWLINE:('\n')->skip;
T_LEXICALERROR:.;
And this is my input file:
%%%This is a sample Written in QUPLA $
@The program compute fibonacci serie%%%
module func
input:
X:real;
output:
i:real;
begin
if x> 0 then
begin
return Func(x-1)+func(x-2);
end
begin
return 1;
end
end
%% This is the main module &%*&()
module main
begin
i:real;
read i;
write (func(i)?1:2);
end
For this input, I have these errors:
In line 5 expecting T_ID but i have T_ID!
In line 8 expecting T_IF,T_WHILE T_READ.... But I have T_IF
Let's start with your errors.
In line 5 expecting T_ID but i have T_ID!
This error occurs because you have the lexer rule T_X:'x'|'X'; which matches the X on line 5 of your sample code. When two lexer rules match the same number of characters, ANTLR picks the rule that is defined first, and T_X is defined before T_ID. So X is tokenized as T_X, while the parser expects T_ID. The answer is: it is not a T_ID token but a T_X token.
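One way to see the fix in isolation is a cut-down grammar. This is only a sketch: the grammar name PriorityDemo and the reduced rule set are assumptions made for illustration, and only the token names come from your grammar. Since no parser rule in your grammar refers to T_X, the sketch simply leaves it out.

```antlr
grammar PriorityDemo;   // hypothetical name; the file would be PriorityDemo.g4

// One declaration such as  X:real;
define : T_ID T_2POINT T_TYPE T_SEPARATOR ;

// Keyword-like tokens are listed before T_ID so that they win
// when both rules match the same number of characters.
T_TYPE      : [rR][eE][aA][lL] ;
T_2POINT    : ':' ;
T_SEPARATOR : ';' ;

// There is no T_X rule here, so a lone 'x' or 'X' falls through to T_ID.
T_ID : [a-zA-Z] [a-zA-Z0-9]* ;
T_WS : [ \t\r\n]+ -> skip ;
```

If you have the grun alias from the ANTLR getting-started guide set up, running grun PriorityDemo define -tokens on the text X:real; should let you check which token type X receives.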
In line 8 expecting T_IF,T_WHILE T_READ.... But I have T_IF
On line 7 of the code example you are trying to define an output variable, i:real. But the output section of your module rule has no define+, so a named definition is not allowed there. I assume you want to allow named output parameters. In that case the module rule should look as follows:
module
: T_MODULE T_ID T_INPUT T_2POINT define+ T_OUTPUT T_2POINT define+ block
| T_MODULE T_ID block
;
Because define+ is missing, the module rule cannot continue past output: on line 6, and everything after it is parsed as the define alternative of the main rule start.
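If that is not the case and your code example is wrong, then you should remove the i: characters from the output section of the module, so that only the type remains, as your original rule T_OUTPUT T_2POINT T_TYPE block expects.
Either way, the answer is: the code example is inconsistent with your grammar.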
You should define your tokens in an order:
You can't use names that are reserved in the target language you use ANTLRv4 with. You defined a grammar rule named while, which conflicts with the while keyword in Java.
Use simpler, easier-to-read ANTLRv4 constructs:
T_WS : (('\t'|'\r'|' ')+) ->skip;
becomes T_WS : [ \t\r]+ -> skip;
T_ID : [a-zA-Z] ([a-zA-Z]|('0'..'9'))*;
becomes T_ID : [a-zA-Z] [a-zA-Z0-9]*;
T_COMMULLINE:'%%%' (.|('\t'|'\r'|' '|'\n'))*? '%%%' ->skip;
becomes T_COMMULLINE:'%%%' .*? '%%%' -> skip;
(the dot . matches any character anyway, including whitespace characters)
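To illustrate the tips about reserved names and simpler constructs together, here is a small sketch. The grammar name TipsDemo, the rule name whileStmt, and the reduced start and block rules are all assumptions made for this example. Your real expwhile and code_while rules would take the place of the simplified pieces.

```antlr
grammar TipsDemo;   // hypothetical name; the file would be TipsDemo.g4

// Reduced entry rule for this sketch only.
start : whileStmt+ EOF ;

// Formerly 'while'. The new name no longer clashes with the Java keyword
// when ANTLR generates a method for this rule.
whileStmt : T_WHILE T_ID T_COMPARE T_ID block ;

// Simplified stand-in for the real block rule.
block : T_BEGIN T_END ;

// Keywords are listed before T_ID so they win equal-length matches.
T_WHILE   : [wW][hH][iI][lL][eE] ;
T_BEGIN   : [bB][eE][gG][iI][nN] ;
T_END     : [eE][nN][dD] ;
T_COMPARE : '>' | '<' | '>=' | '<=' | '<>' ;

// Character sets instead of nested alternatives.
T_ID : [a-zA-Z] [a-zA-Z0-9]* ;
T_WS : [ \t\r\n]+ -> skip ;
```

In your full grammar the rename also has to be applied wherever the rule is referenced, which is the while alternative in start.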