Assignment grammar conflicting with λ-calculus application grammar

Question

I am implementing an extended λ-calculus interpreter using ANTLR4 and it's C++ target. Here is the language grammar:

grammar lambda;

program: expression|;

expression:
    (Int | Bool)                                # literal
    | Identifier                                # variable
    | expression expression                     # application
    | Lambda Identifier '.' expression          # abstraction
    | Identifier '=' expression                 # assign
    | condition                                 # conditional
    | Operator expression expression            # binaryExpression
    | 'print' expression                        # printInstruction
    | '(' expression ')'                        # brackets;

body: expression;
condition: 'if' expression 'then' body 'else' body
    | '(' expression '->' body '|' body;

Lambda: '\' | 'λ';
Bool : 'tru' | 'fls' | 'true' | 'false';
Int: [0-9]+;
Identifier: ('a' ..'z') ('a' ..'z' | '0' ..'9')*;
Operator:
    '+'
    | '-'
    | '*'
    | '/'
    | '<'
    | '>'
    | '<='
    | '>='
    | '==';

WS: [ 
	
]+ -> skip;

I am constructing an AST using the visitor model, which will be evaluated separately. I have encountered an issue with the way ANTLR is parsing the input, and I'm not really sure even what to call it.

Issue 1

// incorrect_association.lambda

y = 1
x = 1

Assignment ( y = ( Application ( Literal ( 1 ) ) ( Assignment ( x = ( Literal ( 1 ) ) ) ) ) )

The AST should be

Assignment ( y = ( Literal ( 1 ) )
Assignment ( x = ( Literal ( 1 ) )

or

Grouping (
    Assignment ( y = ( Literal ( 1 ) ),
    Assignment ( x = ( Literal ( 1 ) )
)

Issue 2

I suppose this may tie into the first issue: expressions across multiple lines are being read as Application expressions.

// incorrect_application.lambda

x = 1
print x

Assignment ( x = ( Application ( Literal ( 1 ) ) ( PrintInstruction ( Identifier ( "x" ) ) ) ) )

The AST should be

Assignment ( x = ( Literal ( 1 ) )
PrintInstruction ( Identifier ( "x" ) )

or

Grouping (
    Assignment ( x = ( Literal ( 1 ) ),
    PrintInstruction ( Identifier ( "x" ) )
)

I am trying to have imperative-like constant assignments, with a functional-like execution. Eventually, the program should just be whatever main = ... (like Haskell). Is it possible to prevent the Application rule from matching two expressions that are on different lines, but continue to allow any other whitespace and parens?

Possible Solution

I was contemplating writing a preprocessor that would just throw semicolons on each line ending. I may need to do this anyway because I plan on adding

imports: 'import' Identifier | '(' imports ')';

as a grammar rule, and haven't found a nice solution for handling imports with ANTLR. If I were to go this route, how would I include ; line endings in my grammar?

PS: I am very new to ANTLR, so any guidance would be super helpful.

rici · Accepted Answer

If you want newlines to be significant, then let them pass through the lexical scanner.

WS: [ 	
]+ -> skip;
NL: [
];

Then you could define a program as a sequence of expressions terminated with newline characters:

program: ( expression NL )*;

If you want semicolons to work as well, just change the definition of NL:

NL: [
;];

You'll also want to change body to accept multiple expressions, although it's not clear to me what kind of punctuation you will want to use. It's possible that

body: expression (NL expression)*;

will work for you, but it might produce unexpected results.

Your application syntax is highly ambiguous. I have no idea what Antlr will make of it, but I can't interpret it. If you have

+ a b c

That must be one of:

(+ a b) (c)
(+ a (b c))
(+ (a b) c)

But I don't see any indication of which of the three should be prefered. I would think that you would need to come up with a grammar which has more precise precedence.

(There is a reason Lisp and Scheme use parentheses :-) )

Assignment grammar conflicting with λ-calculus application grammar

Issue 1

Issue 2

Possible Solution

Answers (1)

Related Questions