Sergey Zubarev

Reputation: 63

Write antlr grammar for custom script

I'm migrating from an old script engine to PHP with the help of ANTLR.

Script variables are surrounded with $, for example: $myvar$. Script function calls look like this: $func($func2('bla-bla'),123)

These are injected directly into HTML without any special delimiters. For example:

<h1>$myvar$</h1> 

will be displayed as

<h1>Sergey</h1>   

My grammar is as follows:

grammar ScriptParser;

options{
output = AST;
ASTLabelType = CommonTree;
}

tokens{
FUNC_CALL;
VAR;
RAW_OUTPUT;
}

program :   stmt*
;

stmt
:   varAtom
    |
    WHAT PLACE HERE???
    ;

expr
options{
    backtrack=true;
    }
    :   orExpr
;

orExpr  :   andExpr (('or'|'||')^ andExpr)*
;

andExpr :   equalityExpr (('and'|'&&')^ equalityExpr)*
;

equalityExpr
    :   comparisonExpr (('=='|'!='|'<>'|'=')^ comparisonExpr)*
;

comparisonExpr
    :   additiveExpr (('>'|'<'|'<='|'>=')^ additiveExpr)*
;

additiveExpr
    :   multiplicativeExpr (('+'|'-')^ multiplicativeExpr)*
;

multiplicativeExpr
    :   notExpr (('*'|'/')^ notExpr)*
;

notExpr
    :   (op='!'|'not')? negationExpr
;

negationExpr
    :   (op='-')? primary
;

primary :   atom
        |'(' expr ')'
;


atom    :   ID
        | varAtom
        | NUMBER
        | HTML_SYMBOL
        | stringAtom
;

varAtom :   '$'ID'(' exprList ')' ->^(FUNC_CALL ID exprList?)
        | '$'ID'$' ->^(VAR ID)
;


stringAtom
    : DOUBLEQUOT! ( ESC_SEQ | ~('\\'|DOUBLEQUOT) )* DOUBLEQUOT!
        | QUOT! ( ESC_SEQ | ~('\\'|QUOT) )* QUOT!
    ;

exprList:   (expr (',' expr)*)?
;


//Begin Lexer

ID  :   CHAR(LCHAR|DIGIT|CHAR)* 
;
    fragment LCHAR
    :   CHAR|'_'
;
    fragment CHAR
    :   LC|UC
;
fragment LC
    :   'a'..'z'|'а'..'я'
    ;
fragment UC
    :   'A'..'Z'|'А'..'Я'
;

HTML_SYMBOL
    :   '&'LC*';'
    ;

fragment ESC_SEQ
    :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
;
QUOT    :   '\''
;
DOUBLEQUOT  :   '"'
;


NUMBER  :   INT | FLOAT
;

fragment INT    :   '0'|('1'..'9' DIGIT*)
;
fragment FLOAT  :   INT('.' DIGIT*)
;

fragment DIGIT
    :   '0'..'9'
;

What should I place in the stmt rule so that everything that is not script-related is marked as raw output?

    stmt
    :   varAtom
        |
        WHAT PLACE HERE???
        ;

Upvotes: 1

Views: 194

Answers (1)

Bart Kiers

Reputation: 170138

Try adding the following lexer rule after all other lexer rules:

OTHER
 : .
 ;

and then add this lexer rule to your stmt rule:

stmt
 : varAtom
 | OTHER
 ;
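
If the goal is for the raw text to appear in the AST under the RAW_OUTPUT imaginary token already declared in the question's tokens block, one possible variation is to attach a rewrite to the OTHER alternative. This is only a sketch in ANTLR v3 syntax, assuming the OTHER lexer rule above is in place; it has not been tested against the full grammar.

```
stmt
 : varAtom                       // script variable or function call
 | OTHER -> ^(RAW_OUTPUT OTHER)  // any other single character becomes raw output
 ;
```

Each OTHER token carries a single character, so consecutive raw characters would each produce their own RAW_OUTPUT node and may need to be concatenated when walking the tree. Characters that an earlier lexer rule already matches (letters form an ID, for instance) will not be tokenized as OTHER, so stmt would likely need extra alternatives for those tokens as well.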

Upvotes: 1
