Reputation: 675
I am building a JavaScript instrumentor with ANTLR, using the Patrick Hulsmeijer EcmaScript 3 grammar.
I'm having a problem parsing this line of code:
function(){}();
This is a direct call of a function expression. The parser recognizes the statement as a function declaration and then fails when it reaches the parentheses after the function body. The reason is that function declarations are given the highest precedence, to avoid the ambiguity with function expressions.
This is how the grammar recognizes function declarations:
sourceElement
options
{
    k = 1 ;
}
    : { input.LA(1) == FUNCTION }? functionDeclaration
    | statement
    ;
I am not even sure that it is a valid EcmaScript statement. Is it?
I think it would be more correct to write:
(function(){})();
which is actually well handled by the parser.
In any case, that is not the core of the question, because I have no control over the code I need to instrument.
I tried removing functionDeclaration from the sourceElement production and adding it to the statementTail production:
statementTail
    : variableStatement
    | emptyStatement
    | expressionStatement
    | functionDeclaration
    | ifStatement
    | ...
    ;
But a build error arises:
[fatal] rule statementTail has non-LL(*) decision due to recursive rule invocations reachable from alts 3,4. Resolve by left-factoring or using syntactic predicates or using backtrack=true option.
|---> : variableStatement
This is because the variableStatement production contains functionExpression as a descendant, which leads to an ambiguity. The parser cannot choose between functionDeclaration and functionExpression because they are almost identical:
functionDeclaration
    : FUNCTION name=Identifier formalParameterList functionBody
    -> ^( FUNCTIONDECL $name formalParameterList functionBody )
    ;
functionExpression
    : FUNCTION name=Identifier? formalParameterList functionBody
    -> ^( FUNCTIONEXPR $name? formalParameterList functionBody )
    ;
Note: I modified the original rewrite rules to use different tree nodes (FUNCTIONDECL and FUNCTIONEXPR) because I need to tell them apart while walking the AST.
How can I solve this ambiguity?
Upvotes: 1
Views: 2839
Reputation: 5256
The parser is right to expect a functionDeclaration when a sourceElement begins with the 'function' keyword. This in fact implements the following restriction from the ECMAScript Language Specification:
an ExpressionStatement cannot start with the function keyword because that might make it ambiguous with a FunctionDeclaration.
The statement in question is thus invalid per the above restriction, although by the productions of the grammar it is not actually ambiguous: because it omits the function identifier, it cannot be a functionDeclaration. A statement that does expose the syntactic ambiguity would be
function f(){}(42)
which according to the ECMAScript spec is a functionDeclaration, followed by an expressionStatement.
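A quick way to check these claims against a real engine is to ask it to compile each snippet without running it. The sketch below is only an illustration: `parses` is a hypothetical helper name, and it assumes a current JavaScript engine such as Node.js, which implements a later ECMAScript edition than ES3 (the restriction on expression statements starting with `function` still applies there).

```javascript
// Hypothetical helper: returns true if `src` compiles as a function body.
// `new Function` only compiles the source; the resulting function is never called.
function parses(src) {
  try {
    new Function(src);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // anything else is unexpected
  }
}

const samples = [
  "function(){}();",     // statement starts with `function`, no name
  "(function(){})();",   // parenthesized function expression, then a call
  "function f(){}(42)",  // declaration of f, then the expression statement (42)
];

for (const src of samples) {
  console.log(JSON.stringify(src), parses(src) ? "parses" : "SyntaxError");
}
```

Per the rules above, the first sample should be rejected and the other two accepted.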
So the best thing to do is to ask the provider of this code for correct syntax. You said that you need to parse it anyway; that could possibly be done using ANTLR's backtracking. Make sure the function identifier is mandatory in the functionDeclaration, and have the parser try a functionDeclaration before a statement. But be aware that, even if this helps for the original statement, it will fail for
function f(){}()
because here the functionDeclaration can be completed successfully, but there is no valid statement following it.
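To make the "declaration first, then backtrack to a statement" idea concrete, here is a toy recursive-descent sketch in JavaScript. It is not ANTLR output and not the grammar from the question: the tokenizer, `parseProgram`, and the tiny expression grammar (empty parameter lists and bodies only, calls with at most one argument) are all assumptions made for illustration.

```javascript
// Toy tokenizer: identifiers/keywords, integers, and ( ) { } ; only.
// Any other character is silently skipped.
function tokenize(src) {
  return src.match(/[A-Za-z_$][\w$]*|\d+|[(){};]/g) || [];
}

function parseProgram(src) {
  const toks = tokenize(src);
  let pos = 0;
  const isIdent = (t) =>
    t !== undefined && /^[A-Za-z_$]/.test(t) && t !== "function";
  const expect = (t) => {
    if (toks[pos] !== t) throw new SyntaxError(`expected ${t} at token ${pos}`);
    pos++;
  };

  // 'function' Identifier? '(' ')' '{' '}'
  // The name is mandatory for declarations and optional for expressions.
  function functionRest(nameRequired) {
    expect("function");
    if (isIdent(toks[pos])) pos++;
    else if (nameRequired) throw new SyntaxError("function name required");
    expect("("); expect(")"); expect("{"); expect("}");
  }

  function primary() {
    const t = toks[pos];
    if (t === "function") return functionRest(false);
    if (t === "(") { pos++; expression(); expect(")"); return; }
    if (isIdent(t) || /^\d/.test(t || "")) { pos++; return; }
    throw new SyntaxError(`unexpected token at ${pos}`);
  }

  // A primary followed by any number of call suffixes.
  function expression() {
    primary();
    while (toks[pos] === "(") {
      pos++;
      if (toks[pos] !== ")") expression();
      expect(")");
    }
  }

  const kinds = [];
  while (pos < toks.length) {
    const start = pos;
    try {
      functionRest(true);      // first choice: functionDeclaration
      kinds.push("FUNCTIONDECL");
    } catch (e) {
      pos = start;             // backtrack, then try an expression statement
      expression();
      if (toks[pos] === ";") pos++;
      kinds.push("EXPRSTMT");
    }
  }
  return kinds;
}

console.log(parseProgram("function(){}();"));    // [ 'EXPRSTMT' ]
console.log(parseProgram("function f(){}(42)")); // [ 'FUNCTIONDECL', 'EXPRSTMT' ]
```

With this ordering, `parseProgram("function f(){}()")` throws a SyntaxError: the declaration of `f` succeeds and is committed, and the remaining `()` is not a valid statement, which is the failure described above.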
Upvotes: 2