kjh
kjh

Reputation: 3405

Compilers: How to parse function calls and function definitions

Upfront, I'm making an interpreter in Python, not an actual compiler that compiles to machine code. I've been skimming quite a few compiler construction guides lately and I understand the basics of generating tokens for your parser and building a syntax tree to evaluate arithmetic expressions, but I don't quite understand how to parse expressions with function calls inside them, things like

Fig. (a)

1 + pow(1, 1)

or how to parse lines when the user is defining a function like this

Fig. (b)

function newFunction( someArgs ){
   ... some magic ...
}

In Fig. (a), how should I tokenize this expression? After reading the reserved word "pow" should I grab everything up to the closing parenthesis and pass that to a parser? or should I include "pow", "(", "1", "1", and ")" each as seperate tokens and add them to my parse tree?

In Fib. (b) I don't have any idea where to even start when it comes to compiling function definitions. Any information to put me in the right direction would be appreciated.

Edit: I'm using a Backus-Naur form grammar:

S ::= expression

expression ::= term | term ([+-] term)+

term ::= factor | factor ([*/] factor)+

factor ::= number | ( expression )

number ::= [0-9]+

Upvotes: 5

Views: 4866

Answers (2)

Ira Baxter
Ira Baxter

Reputation: 95324

Another poster suggests adding the names of functions to your grammar.

That works for toy languages, but not for practical ones where there may be huge libraries as well as large sets of user defined functions.

You can handle the latter by adding function calls to the BNF, in a way that leaves the function name out of the grammar:

S ::= expression

expression ::= term | term ([+-] term)+

term ::= factor | factor ([*/] factor)+

factor ::= number | ( expression ) | identifier |  functioncall

functioncall ::= identifier [(]  arguments [)]

arguments ::= empty | arguments 

arguments ::= expression |  arguments [,] expression

number ::= [0-9]+

identifier ::= [a-z]+

Now your parser can pick up function calls. (Left out of the grammer is a way to define functions... but that's just more syntax which I leave it for you to add).

The price for this is that after parsing, something has decide for each function name, precisely which bit of code it represents. You will need a symbol table to do this. That's another question.

Upvotes: 3

user764357
user764357

Reputation:

Hopefully there are going to be a few good anaswers to this quesiton, but the first things I'd advice you to do to help improve the quality of the questionare:

  1. Limit your scope for your language and
  2. Define your grammar.

Look into Backus–Naur Form, and learn about how to define the grammar your language is going to support.

In the first instance, look at writing a simple parser for a handful of mathematic functions to get your head around it.

<digit>   ::= "0"|"1"|...|"9"
<integer> ::= <digit>*
<expr>    ::= <integer> | <add> | <pow>
<add>     ::= "add(" <expr> "," <expr> ")"
<sub>     ::= "sub(" <expr> "," <expr> ")"
<pow>     ::= "pow(" <expr> "," <expr> ")"

From a formal grammar like this you can then have some surety around what is or isn't a valid expression.

For example: add(1,2) and pow(2,add(2,1)) are both valid where as add(1) isn't as the add grammar needs two expressions.

Upvotes: 1

Related Questions