Jesse Hamlin-Navias
Jesse Hamlin-Navias

Reputation: 41

Why is my Parser written in Racket Brag exceeding 128 MB memory limit

I'm writing an esolang I designed (called RifL) in Racket using the beautiful racket textbook. I'm 99% done, and was testing RifL by writing a larger program in it. When the RifL program got large enough, running it became very very slow, and eventually, running it crashed the evaluation because the program ran out of memory.

I have identified that the problem is coming from the parser I built, which is written in brag. My original parser looks like so:

#lang brag
RifL-program: \[convert-to-deck\] (/NEWLINE \[convert-to-deck\])\*
convert-to-deck: convert-to-name /DIVIDER convert-to-stack
convert-to-name: ((S-PIP-CARD \[/COMMA\])\* S-PIP-CARD) |
                 ((C-PIP-CARD \[/COMMA\])\* C-PIP-CARD) |
                 ((H-PIP-CARD \[/COMMA\])\* H-PIP-CARD) |
                 ((D-PIP-CARD \[/COMMA\])\* D-PIP-CARD)
convert-to-stack: (entry\* /NEWLINE)\* entry\*
entry: (S-PIP-CARD | C-PIP-CARD | H-PIP-CARD | D-PIP-CARD| ROYAL-CARD | JOKER | FACE-DOWN) \[/COMMA\]

I've cut off as many of the processes needed to run RifL, to isolate the problem. Below is a link to a git that has a tokenizer, a lexer, a parser, and a tester file. You will need the beautiful racket package and brag package installed in racket to run the tester.

https://github.com/Jesse-Hamlin-Navias/Rifl-parser-fail

If you run the tester file in racket with a 128 MB limit, it should succeed on parsing the first code example, and exceed memory on the second. In these files I provided, the parser has been simplified to:

RifL-program: [anything] (/NEWLINE [anything])*
@anything: (S-PIP-CARD | C-PIP-CARD | H-PIP-CARD | D-PIP-CARD| ROYAL-CARD | JOKER | FACE-DOWN
               | /COMMA | /NEWLINE | /DIVIDER)*

This is the minimum complexity of the Parser grammar needed to exceed the memory limit. This alarms me, as it is such a simple grammar. My question is, am I doing something wrong, and if so, what? Is the problem in the tokenizer or lexer? If I'm not doing something wrong, does brag just suck? And if brag sucks, what can I replace it with, that doesn't require me to refactor my whole executer, and doesn't force me to build a full parser from scratch?

Matthew Butt­erick or Greg Hendershott, if you're listening, please save me from writing a full parser.

Upvotes: 1

Views: 98

Answers (1)

Jesse Hamlin-Navias
Jesse Hamlin-Navias

Reputation: 41

Left right recursion!

It turns out its pretty important. Also removing all the * was also integral. The parser is still slow with large code, and can crash, but it crashes way less, so a big improvement. I think brag is probably not built well, and could be improved, but approaching the grammar right is important.

#lang brag
RifL-program: () | convert-to-deck | RifL-program-sub [/NEWLINE] [convert-to-deck]
@RifL-program-sub: [convert-to-deck] | RifL-program-sub [/NEWLINE] [convert-to-deck]
convert-to-deck: convert-to-name /DIVIDER convert-to-stack
convert-to-name: [(s-name [/COMMA])] S-PIP-CARD | [(c-name [/COMMA])] C-PIP-CARD |
                 [(h-name [/COMMA])] H-PIP-CARD | [(d-name [/COMMA])] D-PIP-CARD
@s-name: [(s-name [/COMMA])] S-PIP-CARD
@c-name: [(c-name [/COMMA])] C-PIP-CARD
@h-name: [(h-name [/COMMA])] H-PIP-CARD
@d-name: [(d-name [/COMMA])] D-PIP-CARD
convert-to-stack: [(sub-stack [/COMMA])] [/NEWLINE] [entry]
@sub-stack: [(sub-stack [/COMMA])] [/NEWLINE] [entry]
entry: (S-PIP-CARD | C-PIP-CARD | H-PIP-CARD | D-PIP-CARD| ROYAL-CARD | JOKER | FACE-DOWN)

Upvotes: 2

Related Questions