mpen

Reputation: 283043

Dynamic parser - read tokens from a separate file

Let's say I want to parse my new language that looks like this:

main.mylang

import "tags.mylang"
cat dog bacon

And there's another file tags.mylang that looks like this:

cat "meow"
dog "woof"
bacon "sizzle"

Running main.mylang would output

meow woof sizzle

The problem I'm having is that "cat", "dog", and "bacon" are defined in a separate file, as implemented by the mylang developer; i.e., I can't make them part of the grammar beforehand.

Is it possible to dynamically add these tags into the grammar as it's parsing? I don't want to add a wildcard \w+ or something because I want it to error on unrecognized tags.

Edit: I'm writing this using jison, which is based on bison.

Upvotes: 0

Views: 717

Answers (2)

rici

Reputation: 241861

I'll assume that tags all match the pattern for variables, whatever pattern that might be (\a\w*, maybe). Define a dictionary whose keys are tags; the value can be whatever you want to associate with the tag. As I understand it, you can make this dictionary available to both the parser and the lexer by putting it inside the object parser.yy.

The lexer rule for variables would be something like this (I don't know much about jison, so this is based on bison+flex):

{variable}    if (yytext in yy.tags) { return TAG; } else { return VARIABLE; }

If you wanted to have different token types for different tags, (perhaps because tags are aliases for grammatical concepts, or something like that), you could store the token type in the tag dictionary, so that you could return it from the lexer.
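To make the lookup concrete, here is a minimal sketch in plain JavaScript, outside any parser generator. The `yy` object only stands in for parser.yy, and `classify`, the entry shape, and the `MODIFIER` token name are hypothetical names for illustration. It uses a Map rather than a plain object because `in` on a plain object also matches inherited names such as `constructor`.

```javascript
// Hypothetical shared state, standing in for parser.yy in the answer above.
const yy = {
  tags: new Map([
    // Each entry stores the token type to return, plus whatever value goes with the tag.
    ["cat", { token: "TAG", value: "meow" }],
    ["loud", { token: "MODIFIER", value: null }], // assumed second token type
  ]),
};

// Decide the token type for text that matched the variable pattern.
function classify(yytext) {
  const entry = yy.tags.get(yytext);
  return entry ? entry.token : "VARIABLE";
}

console.log(classify("cat"), classify("loud"), classify("constructor"));
```

The lexer action would then return whatever `classify` produces, so the grammar can treat known tags and ordinary variables as different tokens.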

In the grammar for tag definition files, you could add a tag definition simply by adding the key and appropriate value to yy.tags.
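As a rough illustration of filling the dictionary, the sketch below reads definitions in the tags.mylang format from the question and adds each one to a Map. In a real grammar this would happen in the semantic action for a tag definition; the regular expression and the `loadTags` helper are assumptions made to keep the example self-contained.

```javascript
// Hypothetical helper: read lines such as  cat "meow"  and add them to the dictionary.
function loadTags(source, tags = new Map()) {
  for (const line of source.split(/\r?\n/)) {
    if (line.trim() === "") continue; // skip blank lines
    const m = /^\s*(\w+)\s+"([^"]*)"\s*$/.exec(line);
    if (!m) throw new SyntaxError(`Bad tag definition: ${line}`);
    tags.set(m[1], m[2]); // key is the tag, value is its output
  }
  return tags;
}

const tags = loadTags('cat "meow"\ndog "woof"\nbacon "sizzle"\n');
console.log([...tags.keys()].join(" "));
```

Because the same dictionary is consulted by the lexer, any tag added here is recognized as soon as the importing file is tokenized.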

Upvotes: 2

David Gorsline

Reputation: 5018

You can go with the wildcard match \w+ that you suggest, then use the YYERROR macro to raise your own syntax error when your parser's semantic logic detects an unrecognized/undefined tag.
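A minimal sketch of that idea in plain JavaScript: every word is accepted by the \w+ pattern, and the check against the known tags happens afterwards. Throwing a SyntaxError here only stands in for YYERROR; `run` and `knownTags` are hypothetical names, and the tag values are the ones from the question.

```javascript
// Tags as they would be after reading tags.mylang from the question.
const knownTags = new Map([
  ["cat", "meow"],
  ["dog", "woof"],
  ["bacon", "sizzle"],
]);

// Accept any word lexically, then reject unknown tags in the "semantic" step.
function run(source, tags) {
  const out = [];
  for (const word of source.match(/\w+/g) || []) {
    if (!tags.has(word)) {
      throw new SyntaxError(`Unrecognized tag: ${word}`);
    }
    out.push(tags.get(word));
  }
  return out.join(" ");
}

console.log(run("cat dog bacon", knownTags)); // prints "meow woof sizzle"
```

The trade-off is that the error comes from your own code rather than from the generated parser, so the message and position reporting are up to you.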

Upvotes: 1
