Andrew Smith

Reputation: 197

Parser Generator: How to use GPLEX and GPPG together?

After looking through posts for good C# parser generators, I stumbled across GPLEX and GPPG. I'd like to use GPLEX to generate tokens for GPPG to parse and build a tree from (similar to the lex/yacc relationship). However, I can't find an example of how the two work together. With lex/yacc, lex returns tokens that are defined by yacc and can store values in yylval. How is this done in GPLEX/GPPG? (It is missing from their documentation.)

Here is the lex code I would like to convert to GPLEX:

%{
 #include <stdio.h>
 #include "y.tab.h"
%}
%%
[Oo][Rr]                return OR;
[Aa][Nn][Dd]            return AND;
[Nn][Oo][Tt]            return NOT;
[A-Za-z][A-Za-z0-9_]*   yylval=yytext; return ID;
%%

Thanks! Andrew

Upvotes: 7

Views: 10177

Answers (5)

Alex

Reputation: 389

First, add a reference to "QUT.ShiftReduceParser.dll" to your project. It is provided in the GPLEX download package.

Sample code for the main program:

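using System;
using System.IO;   // FileStream, FileMode
using ....;
using QUT.Gppg;
using Scanner;
using Parser;

namespace NCParser
{
class Program
{
    static void Main(string[] args)
    {
        string pathTXT = @"C:\temp\testFile.txt";
        // Dispose the stream once parsing is done.
        using (FileStream file = new FileStream(pathTXT, FileMode.Open))
        {
            // Qualified names are used because the namespaces "Scanner" and "Parser"
            // share their names with the generated classes; the bare names would
            // resolve to the namespaces.
            Scanner.Scanner scanner = new Scanner.Scanner();
            scanner.SetSource(file, 0);
            Parser.Parser parser = new Parser.Parser(scanner);
            // Constructing the parser does not run it; Parse() returns true on success.
            bool ok = parser.Parse();
        }
    }
}
}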

Sample code for the GPLEX input file:

%using Parser;           //include the namespace of the generated Parser-class
%namespace Scanner       //names the Namespace of the generated Scanner-class
%visibility public       //visibility of the types "Tokens","ScanBase","Scanner"
%scannertype Scanner     //names the Scannerclass to "Scanner"
%scanbasetype ScanBase   //names the Scanbaseclass to "ScanBase"
%tokentype Tokens        //names the Tokenenumeration to "Tokens"

%option codePage:65001 out:Scanner.cs /*see the documentation of GPLEX for further Options you can use */

%{ //user-specified code will be copied in the Output-file
%}

OR [Oo][Rr]
AND [Aa][Nn][Dd]
Identifier [A-Za-z][A-Za-z0-9_]*

%% //Rules Section
%{ //user-code that will be executed before getting the next token
%}

{OR}           {return (int)Tokens.kwOR;}
{AND}          {return (int)Tokens.kwAND;}
{Identifier}   {yylval = yytext; return (int)Tokens.ID;}

%% //User-code Section

Sample code for the GPPG input file:

%using Scanner      //include the Namespace of the scanner-class
%output=Parser.cs   //names the output-file
%namespace Parser  //names the namespace of the Parser-class

%parsertype Parser      //names the Parserclass to "Parser"
%scanbasetype ScanBase  //names the ScanBaseclass to "ScanBase"
%tokentype Tokens       //names the Tokensenumeration to "Tokens"

%token kwAND "AND", kwOR "OR" //the received Tokens from GPLEX
%token ID

%% //Grammar Rules Section

program  : /* nothing */
         | Statements
         ;

Statements : EXPR "AND" EXPR
           | EXPR "OR" EXPR
           ;

EXPR : ID
     ;

%% //User-code Section
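// Don't forget to declare the parser constructor.
// "Scanner.Scanner" is qualified because the bare name "Scanner" resolves to the namespace.
public Parser(Scanner.Scanner scnr) : base(scnr) { }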
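
On the yylval part of the question, here is a minimal sketch, assuming the GPPG %union directive is used to declare the semantic value type. The field name sVal is hypothetical. As far as I understand the GPPG documentation, %union generates a struct for the value type and yylval is an instance of it, so the scanner assigns to a field of yylval rather than to yylval itself. In the GPPG input file, the declarations would look roughly like this:

```yacc
%union { public string sVal; }   /* sVal is a hypothetical field name */
%token <sVal> ID                 /* ID carries its text in sVal */
```

The matching rule in the GPLEX input file would then fill that field before returning the token:

```lex
{Identifier}   { yylval.sVal = yytext; return (int)Tokens.ID; }
```

With the token tagged this way, a grammar action that refers to an ID through $1 should receive the string stored in sVal, as in yacc.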

Upvotes: 4

ernstc

Reputation: 66

Some time ago I had the same need to use GPLEX and GPPG together, and to make the job easier I created a NuGet package for using both in Visual Studio.
The package can be installed in C# projects based on .NET Framework and adds some cmdlets to the Package Manager Console in Visual Studio. These cmdlets help you configure the C# project so that GPPG and GPLEX run as part of the build process. Essentially, you edit the YACC and LEX files as source code in your project, and the parser and the scanner are generated when the project is built. The cmdlets also add to the project the files needed to customize the parser and the scanner.

You can find it here: https://www.nuget.org/packages/YaccLexTools/

And here is a link to the blog post that explains how to use it: http://ecianciotta-en.abriom.com/2013/08/yacclex-tools-v02.html

Upvotes: 1

greenoldman

Reputation: 21082

Ironically, when I got into parsers in C# I started with exactly those two tools (about a year ago). The lexer had a tiny bug (easy to fix):

but the parser had a more severe one:

The lexer should be fixed by now (release date is June 2013), but the parser probably still has this bug (May 2012).

So I wrote my own suite :-) https://sourceforge.net/projects/naivelangtools/ and have used and developed it since then.

Your example translates (in NLT) to:

/[Oo][Rr]/                -> OR;
/[Aa][Nn][Dd]/            -> AND;
/[Nn][Oo][Tt]/            -> NOT;
// by default text is returned as value
/[A-Za-z][A-Za-z0-9_]*/   -> ID;

The entire suite is similar to lex/yacc; where possible it does not rely on side effects (so you return the appropriate value).

Upvotes: 0

mmcelroy

Reputation: 21

I had a similar issue - not knowing how to use my output from GPLEX with GPPG due to an apparent lack of documentation. I think the problem stems from the fact that the GPLEX distribution includes gppg.exe along with gplex.exe, but only documentation for GPLEX.

If you go to the GPPG homepage and download that distribution, you'll get the documentation for GPPG, which describes the requirements for the input file, how to construct your grammar, etc. Oh, and you'll also get both binaries again - gppg.exe and gplex.exe.

It almost seems like it would be simpler to just include everything in one package. It could definitely clear up some confusion, especially for those who may be new to lexical analysis (tokenization) and parsing (and may not be 100% familiar yet with the differences between the two).

So anyway, for those who may be doing this for the first time:

GPLEX http://gplex.codeplex.com - used for tokenization/scanning/lexical analysis (same thing)

GPPG http://gppg.codeplex.com/ - takes output from a tokenizer as input to parse. For example, parsers use grammars and can do things a simple tokenizer cannot, like detect whether sets of parentheses match up.

Upvotes: 2

lstern

Reputation: 1629

Have you considered using Roslyn? (This isn't a proper answer but I don't have enough reputation to post this as a comment)

Upvotes: 0
