stix

Reputation: 1146

How can I reserve a set of keywords in a name field in boost spirit?

I have the following definition for an object record in PureData that I need to be able to parse into my generic PdObject struct:

Description:
Defines an object
Syntax:
#X obj [x_pos] [y_pos] [object_name] [p1] [p2] [p3] [...];\r\n
Parameters:
[x_pos] - horizontal position within the window
[y_pos] - vertical position within the window
[object_name] - name of the object (optional)
[p1] [p2] [p3] [...] the parameters of the object (optional)
Example:
#X obj 55 50;
#X obj 132 72 trigger bang float;

I have written the following Boost.Spirit rule, which I have tested and which works:

template <typename Iterator> struct PdObjectGrammar : qi::grammar<Iterator, PdObject()> { 
    PdObjectGrammar() : PdObjectGrammar::base_type(start) { 
        using namespace qi; 
        start = skip(space)[objectRule]; 
        pdStringRule = +(('\\'  >> space) | (graph-lit(";"))); 
        objectRule = "#X obj" >> int_ >> int_ >> -(pdStringRule) >> *(pdStringRule) >> ";"; 
        BOOST_SPIRIT_DEBUG_NODES((start)(objectRule)(pdStringRule))
    }
    private: 
    qi::rule<Iterator, std::string()> pdStringRule; 
    qi::rule<Iterator, PdObject()> start; 
    qi::rule<Iterator, PdObject(), qi::space_type> objectRule; 

};

However, there are also special "reserved names" that cannot be used as an object name, such as "bng", "tgl", "nbx", etc.

For example, here is another type of "obj" using a reserved name keyword that must be parsed separately by a different rule:

#X obj 92 146 bng 20 250 50 0 empty empty empty 0 -10 0 12 #fcfcfc #000000 #000000;

How can I modify my previous qi rule to not parse the above string, and leave it for another grammar to check (which would parse it to a different struct)?

Postscript:

My full test for the PdObjectGrammar is:

#define BOOST_SPIRIT_DEBUG
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/include/qi.hpp>

#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>


namespace qi = boost::spirit::qi;

struct PdObject {
    int xPos;
    int yPos;
    std::string name;
    std::vector<std::string> params;
};


BOOST_FUSION_ADAPT_STRUCT(
    PdObject,
    xPos,
    yPos,
    name,
    params
)

template <typename Iterator> struct PdObjectGrammar : qi::grammar<Iterator, PdObject()> { 
    PdObjectGrammar() : PdObjectGrammar::base_type(start) { 
        using namespace qi; 
        start = skip(space)[objectRule]; 
        pdStringRule = +(('\\'  >> space) | (graph-lit(";"))); 
        objectRule = "#X obj" >> int_ >> int_ >> -(pdStringRule) >> *(pdStringRule) >> ";"; 
        BOOST_SPIRIT_DEBUG_NODES((start)(objectRule)(pdStringRule))
    }
    private: 
    qi::rule<Iterator, std::string()> pdStringRule; 
    qi::rule<Iterator, PdObject()> start; 
    qi::rule<Iterator, PdObject(), qi::space_type> objectRule; 

};


int main(int argc, char** argv)
{
  if(argc != 2)
    {
        std::cout << "Usage: "  <<argv[0] << " <PatchFile>" << std::endl;
        exit(1); 
    }

    std::ifstream inputFile(argv[1]); 
    std::string inputString(std::istreambuf_iterator<char>(inputFile), {}); 

    PdObject msg;
    PdObjectGrammar<std::string::iterator> parser; 

    bool success = qi::phrase_parse(inputString.begin(), inputString.end(), parser, boost::spirit::ascii::space, msg); 
    std::cout << "Success: " << success << std::endl;

    return 0; 

}

Upvotes: 1

Views: 78

Answers (1)

sehe

Reputation: 393674

In a way "keywordness" is not part of the grammar. It's a semantic check.

There's no standard way in which grammars deal with keywords. For example, C++ has a number of identifiers that are reserved only contextually (such as override and final).

The short story of it is you will just have to express your constraints in code or validate semantics after-the-fact (on the parsed result).
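To make the "validate after the fact" option concrete, here is a minimal sketch of a check run on the parsed result. It is written without Spirit so it stands alone. The names isReservedName and isGenericObject are hypothetical, and the reserved list is simply the one used in the snippets in this answer, so it may be incomplete.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Same shape as the PdObject struct from the question.
struct PdObject {
    int xPos;
    int yPos;
    std::string name;
    std::vector<std::string> params;
};

// Reserved names as listed in this answer (assumption: the list may be incomplete).
inline bool isReservedName(std::string const& name) {
    static std::array<char const*, 9> const reserved{
        {"bng", "tgl", "nbx", "vsl", "hsl", "vradio", "hradio", "vu", "cnv"}};
    return std::any_of(reserved.begin(), reserved.end(),
                       [&](char const* kw) { return name == kw; });
}

// Hypothetical post-parse check: accept the result only if the name is not reserved.
inline bool isGenericObject(PdObject const& obj) {
    return !isReservedName(obj.name);
}
```

Because the comparison is on the whole parsed name, a name such as bngalore is not mistaken for bng. The cost is that the parse has already succeeded by the time the check runs, so the caller has to retry with the other grammar.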

Naively: Live

string     = +('\\' >> qi::space | qi::graph - ";");
name       = string - "bng" - "tgl" - "nbx" - "vsl" - "hsl" - "vradio" - "hradio" - "vu" - "cnv";
object     = "#X obj"       //
    >> qi::int_ >> qi::int_ //
    >> -name                //
    >> *string >> ";";

Or Live

string     = +('\\' >> qi::space | qi::graph - ";");
builtin    = qi::lit("bng") | "tgl" | "nbx" | "vsl" | "hsl" | "vradio" | "hradio" | "vu" | "cnv";
object     = "#X obj"        //
    >> qi::int_ >> qi::int_  //
    >> -(!builtin >> string) //
    >> *string >> ";";

Symbols

You can make this a bit more elegant, maintainable and possibly more efficient by defining a symbol for it: Live

qi::symbols<char> builtin;


// ...
builtin += "bng", "tgl", "nbx", "vsl", "hsl", "vradio", "hradio", "vu", "cnv";

string = +('\\' >> qi::space | qi::graph - ";");
object = "#X obj"                //
         >> qi::int_ >> qi::int_ //
         >> -(string - builtin)    //
         >> *string >> ";";

Distinct Keywords

There's a flaw. When the user gives an object a name that merely starts with one of the builtins, like bngalore or vslander, the builtin still matches as a prefix, so the name is wrongly rejected: Live

To account for this, make sure we're on a lexeme boundary: Live

auto kw = [](auto const& p) { return qi::copy(qi::lexeme[p >> !(qi::graph - ';')]); };
string = +('\\' >> qi::space | qi::graph - ";");
object = "#X obj"                //
    >> qi::int_ >> qi::int_      //
    >> -(!kw(builtin) >> string) //
    >> *string >> ";";
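The difference between a prefix match and a match on a token boundary can be sketched in plain C++, independent of Spirit. The helper names below are hypothetical. matchesKeyword is meant to mirror the intent of `p >> !(qi::graph - ';')`: the keyword only counts if the next character cannot continue the token.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// True for characters that can continue a Pd string token:
// printable, non-space, and not the record terminator ';'.
inline bool isTokenChar(char c) {
    return std::isgraph(static_cast<unsigned char>(c)) && c != ';';
}

// Prefix-only test: the flawed check that also fires on "vslander".
inline bool startsWith(std::string const& input, std::string const& kw) {
    return input.compare(0, kw.size(), kw) == 0;
}

// Boundary-aware test: the keyword must be followed by the end of input
// or by a character that cannot be part of the same token.
inline bool matchesKeyword(std::string const& input, std::string const& kw) {
    if (!startsWith(input, kw)) return false;
    return input.size() == kw.size() || !isTokenChar(input[kw.size()]);
}
```

With these, startsWith("vslander 1 2;", "vsl") is true while matchesKeyword("vslander 1 2;", "vsl") is false, which is the behaviour we want from the negative lookahead.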

It doesn't work!

That's because the grammar is flawed. In your defense, the specification is extremely sloppy. It's one of those grammars alright.

With all those things being optional, ask yourself: how does the parser know that the name is omitted when there are parameters? As far as I can see the parser can never tell, so presumably when the name is omitted there cannot be any parameters either.

We can express that: Live

string = +('\\' >> qi::space | qi::graph - ";");
object = "#X obj"                                   //
    >> qi::int_ >> qi::int_                         //
    >> !kw(builtin) >> -(string >> *string) >> ";"; //

Oh noes, now the entire (string >> *string) is compatible with just the name attribute...:

Input: "#X obj 132 72 trigger bang float;"
 -> (132 72 "triggerbangfloat" { })

Here I'd advise to adjust the AST to reflect the parsed grammar:

struct GenericObject {
    String              name;
    std::vector<String> params;
};

struct PdObject {
    int           xPos, yPos;
    GenericObject generic;
};

BOOST_FUSION_ADAPT_STRUCT(PdObject, xPos, yPos, generic)
BOOST_FUSION_ADAPT_STRUCT(GenericObject, name, params)

Now, it does propagate the attributes correctly: Live, note the extra sub-object (()) in the output:

Input: "#X obj 132 72 trigger bang float;"
 -> (132 72 ("trigger" { "bang" "float" }))
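As a rough illustration of the mapping we want (first token is the name, everything after it is a parameter), here is a hand-written sketch in plain C++. It is not how Spirit propagates attributes. tokenize and toGeneric are hypothetical helpers, and the escape handling is my reading of the `string` rule: a backslash followed by whitespace keeps that whitespace in the token.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

struct GenericObject {
    std::string name;
    std::vector<std::string> params;
};

// Splits the text after the coordinates into tokens, stopping at ';'.
// A backslash followed by whitespace keeps that whitespace inside the token;
// any other backslash is kept as an ordinary character.
inline std::vector<std::string> tokenize(std::string const& text) {
    std::vector<std::string> tokens;
    std::string current;
    for (std::size_t i = 0; i < text.size(); ++i) {
        char c = text[i];
        bool nextIsSpace = i + 1 < text.size() &&
                           std::isspace(static_cast<unsigned char>(text[i + 1]));
        if (c == '\\' && nextIsSpace) {
            current += text[++i];  // escaped whitespace belongs to the token
        } else if (c == ';') {
            break;                 // end of record
        } else if (std::isspace(static_cast<unsigned char>(c))) {
            if (!current.empty()) {
                tokens.push_back(current);
                current.clear();
            }
        } else {
            current += c;
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}

// First token is the name, the rest are parameters.
inline GenericObject toGeneric(std::string const& text) {
    GenericObject obj;
    std::vector<std::string> tokens = tokenize(text);
    if (!tokens.empty()) {
        obj.name = tokens.front();
        obj.params.assign(tokens.begin() + 1, tokens.end());
    }
    return obj;
}
```

For "trigger bang float;" this yields the name trigger with the parameters bang and float, the same split as in the output above.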

Taking It All The Way

As a pro tip, don't implement the parser in the same sloppy fashion as the specification was done. Likely, you just want to parse different object types with dedicated AST types and ditto rules.

For really advanced/pluggable grammars, you might dispatch the rules based on the name symbol. That's known as the Nabialek Trick.
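The core idea of dispatching on the name can be sketched without Spirit: look the name up in a table and hand the rest of the input to whatever the table returns. The sketch below is not the Nabialek Trick itself (as far as I know, in Qi the table would typically be a qi::symbols mapping names to rule pointers, combined with a lazy parser). The handlers and their labels are made up purely to keep the dispatch visible.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

using Params = std::vector<std::string>;

// Each handler stands in for a dedicated rule for one object kind; here it
// only returns a label plus the parameter count.
using Handler = std::function<std::string(Params const&)>;

inline std::string dispatch(std::string const& name, Params const& params) {
    static std::map<std::string, Handler> const handlers{
        {"vsl", [](Params const& p) { return "Vslider/" + std::to_string(p.size()); }},
        {"bng", [](Params const& p) { return "Bang/" + std::to_string(p.size()); }},
    };
    auto it = handlers.find(name);
    if (it != handlers.end()) return it->second(params);
    // Unknown names fall back to the generic object.
    return "Generic/" + std::to_string(params.size());
}
```

Adding a new object kind then means adding one table entry, rather than touching the top-level rule.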

Let's generalize our object rule:

object = "#X obj"           //
    >> qi::int_ >> qi::int_ //
    >> definition           //
    >> ";"                  //
    ;

Now let's demo the VSL rule, in addition to generic objects:

definition = vslider | generic;

Generic is still what we had before:

generic           //
    = opt(string) // name
    >> *string;   // params

Let's do a rough take on Vslider:

vslider                             //
    = qi::lexeme["vsl" >> boundary] //
    >> opt(qi::uint_)               // width
    >> opt(qi::uint_)               // height
    >> opt(qi::double_)             // bottom
    >> opt(qi::double_)             // top
    >> opt(bool_)                   // log
    >> opt(bool_)                   // init
    >> opt(string)                  // send
    >> opt(string)                  // receive
    >> opt(string)                  // label
    >> opt(qi::int_)                // x_off
    >> opt(qi::int_)                // y_off
    >> opt(string)                  // font
    >> opt(qi::uint_)               // fontsize
    >> opt(rgb)                     // bg_color
    >> opt(rgb)                     // fg_color
    >> opt(rgb)                     // label_color
    >> opt(qi::double_)             // default_value
    >> opt(bool_)                   // steady_on_click
    ;

Of course we need a few helpers:

qi::uint_parser<int32_t, 16, 6, 6> hex6{};
rgb = ('#' >> hex6) | qi::int_;

auto boundary = qi::copy(!(qi::graph - ';'));
auto opt = [](auto const& p) { return qi::copy(p | &qi::lit(';')); };

bool_ = qi::bool_ | qi::uint_parser<bool, 2, 1, 1>{};
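To show what the rgb helper is meant to accept, here is a standalone sketch of the same two alternatives in plain C++: either '#' followed by exactly six hex digits, or a signed decimal integer. parseColor is a hypothetical name. Unlike the Qi rule, it insists that the whole token is consumed.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <exception>
#include <string>

// Parses "#rrggbb" (exactly six hex digits) or a plain decimal integer.
// Returns false when the token is neither.
inline bool parseColor(std::string const& token, std::int32_t& out) {
    if (token.size() == 7 && token[0] == '#') {
        std::int32_t value = 0;
        for (std::size_t i = 1; i < token.size(); ++i) {
            unsigned char c = static_cast<unsigned char>(token[i]);
            if (!std::isxdigit(c)) return false;
            int digit = std::isdigit(c) ? c - '0' : std::tolower(c) - 'a' + 10;
            value = value * 16 + digit;
        }
        out = value;
        return true;
    }
    // Fallback: decimal integer such as -262144.
    try {
        std::size_t used = 0;
        int value = std::stoi(token, &used);
        if (used != token.size()) return false;
        out = value;
        return true;
    } catch (std::exception const&) {
        return false;  // not a number, or out of range
    }
}
```

For example, "#fcfcfc" gives 16579836 and "-262144" gives -262144, while "empty" is rejected.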

And the AST types:

struct RGB {
    int32_t rgb;
};

namespace Defs {
    using boost::optional;

    struct Generic {
        String              name;
        std::vector<String> params;
    };

    struct Vslider {
        optional<unsigned> width;           // horizontal size of gui element
        optional<unsigned> height;          // vertical size of gui element
        optional<double>   bottom;          // minimum value
        optional<double>   top;             // maximum value
        bool               log = false;     // when set, the slider range is output
                                            // logarithmically; otherwise its output
                                            // is linear
        String           init;              // sends default value on patch load
        String           send;              // send symbol name
        String           receive;           // receive symbol name
        optional<String> label;             // label
        int              x_off = 0;         // horizontal position of the label
                                            // text relative to the upperleft
                                            // corner of the object
        int y_off = 0;                      // vertical position of the label
                                            // text relative to the upperleft
                                            // corner of the object
        optional<String>   font;            // font type
        optional<unsigned> fontsize;        // font size
        optional<RGB>      bg_color;        // background color
        optional<RGB>      fg_color;        // foreground color
        optional<RGB>      label_color;     // label color
        optional<double>   default_value;   // default value times hundred
        optional<bool>     steady_on_click; // when set, fader is steady on click,
                                            // otherwise it jumps on click
    };

    using Definition = boost::variant<Vslider, Generic>;
} // namespace Defs

using Defs::Definition;

struct PdObject {
    int        xPos, yPos;
    Definition definition;
};

Putting it all together:

Full Demo

Live On Coliru

// #define BOOST_SPIRIT_DEBUG
#include <boost/core/demangle.hpp>
#include <boost/fusion/adapted.hpp>
#include <boost/fusion/include/io.hpp>
#include <boost/optional/optional_io.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iomanip>
#include <iostream>

namespace Ast {
    // C++ makes it hard to pretty-print containers...
    struct print_hack : std::char_traits<char> {};
    using String = std::basic_string<char, print_hack>;
    static inline std::ostream& operator<<(std::ostream& os, String const& s) { return os << quoted(s); }
    static inline std::ostream& operator<<(std::ostream& os, std::vector<String> const& ss) {
        os << "{";
        for (auto& s : ss) os << " " << s;
        return os << " }";
    }

    struct RGB {
        int32_t rgb;
    };

    namespace Defs {
        using boost::optional;

        struct Generic {
            String              name;
            std::vector<String> params;
        };

        struct Vslider {
            optional<unsigned> width;           // horizontal size of gui element
            optional<unsigned> height;          // vertical size of gui element
            optional<double>   bottom;          // minimum value
            optional<double>   top;             // maximum value
            bool               log = false;     // when set, the slider range is output
                                                // logarithmically; otherwise its output
                                                // is linear
            String           init;              // sends default value on patch load
            String           send;              // send symbol name
            String           receive;           // receive symbol name
            optional<String> label;             // label
            int              x_off = 0;         // horizontal position of the label
                                                // text relative to the upperleft
                                                // corner of the object
            int y_off = 0;                      // vertical position of the label
                                                // text relative to the upperleft
                                                // corner of the object
            optional<String>   font;            // font type
            optional<unsigned> fontsize;        // font size
            optional<RGB>      bg_color;        // background color
            optional<RGB>      fg_color;        // foreground color
            optional<RGB>      label_color;     // label color
            optional<double>   default_value;   // default value times hundred
            optional<bool>     steady_on_click; // when set, fader is steady on click,
                                                // otherwise it jumps on click
        };

        using Definition = boost::variant<Generic, Vslider>;

        using boost::fusion::operator<<;
    } // namespace Defs

    using Defs::Definition;

    struct PdObject {
        int        xPos, yPos;
        Definition definition;
    };

    using boost::fusion::operator<<;
}

BOOST_FUSION_ADAPT_STRUCT(Ast::Defs::Vslider, width, height, bottom, top, log, init, send, receive, label,
                          x_off, y_off, font, fontsize, bg_color, fg_color, label_color, default_value,
                          steady_on_click)
BOOST_FUSION_ADAPT_STRUCT(Ast::Defs::Generic, name, params)
BOOST_FUSION_ADAPT_STRUCT(Ast::RGB, rgb)
BOOST_FUSION_ADAPT_STRUCT(Ast::PdObject, xPos, yPos, definition)

namespace qi = boost::spirit::qi;

template <typename Iterator> struct PdObjectGrammar : qi::grammar<Iterator, Ast::PdObject()> {
    PdObjectGrammar() : PdObjectGrammar::base_type(start) {
        start = qi::skip(qi::blank)[ object ];

        /* #X obj [x_pos] [y_pos] [object_name] [p1] [p2] [p3] [...];\r\n
         * Parameters:
         *  [x_pos] - horizontal position within the window
         *  [y_pos] - vertical position within the window
         *  [object_name] - name of the object (optional)
         *  [p1] [p2] [p3] [...] the parameters of the object (optional)
         */
        qi::uint_parser<int32_t, 16, 6, 6> hex6{};
        rgb = ('#' >> hex6) | qi::int_;

        auto boundary = qi::copy(!(qi::graph - ';'));
        auto opt = [](auto const& p) { return qi::copy(p | &qi::lit(';')); };

        bool_ = qi::bool_ | qi::uint_parser<bool, 2, 1, 1>{};

        vslider                             //
            = qi::lexeme["vsl" >> boundary] //
            >> opt(qi::uint_)               // width
            >> opt(qi::uint_)               // height
            >> opt(qi::double_)             // bottom
            >> opt(qi::double_)             // top
            >> opt(bool_)                   // log
            >> opt(bool_)                   // init
            >> opt(string)                  // send
            >> opt(string)                  // receive
            >> opt(string)                  // label
            >> opt(qi::int_)                // x_off
            >> opt(qi::int_)                // y_off
            >> opt(string)                  // font
            >> opt(qi::uint_)               // fontsize
            >> opt(rgb)                     // bg_color
            >> opt(rgb)                     // fg_color
            >> opt(rgb)                     // label_color
            >> opt(qi::double_)             // default_value
            >> opt(bool_)                   // steady_on_click
            ;

        generic           //
            = opt(string) // name
            >> *string;   // params

        definition = vslider | generic;

        string = +('\\' >> qi::space | qi::graph - ";");
        object = "#X obj"           //
            >> qi::int_ >> qi::int_ //
            >> definition           //
            >> ";"                  //
            ;

        BOOST_SPIRIT_DEBUG_NODES(          //
            (start)(object)(string)(rgb)   //
            (definition)(vslider)(generic) //
            (bool_))                       //
    }

  private:
    using Skipper = qi::blank_type;
    qi::rule<Iterator, Ast::PdObject(),         Skipper> object;
    qi::rule<Iterator, Ast::Defs::Vslider(),    Skipper> vslider;
    qi::rule<Iterator, Ast::Defs::Generic(),    Skipper> generic;
    qi::rule<Iterator, Ast::Defs::Definition(), Skipper> definition;

    // lexemes
    qi::rule<Iterator, bool()>          bool_;
    qi::rule<Iterator, Ast::RGB()>      rgb;
    qi::rule<Iterator, Ast::String()>   string;
    qi::rule<Iterator, Ast::PdObject()> start;
};

int main()
{
    PdObjectGrammar<std::string::const_iterator> const parser;

    for (std::string const input :
         {
             "#X obj 55 50;",
             "#X obj 92 146 bng 20 250 50 0 empty empty empty 0 -10 0 12 #fcfcfc #000000 #000000;",
             "#X obj 50 38 vsl 15 128 0 127 0 0 empty empty empty 0 -8 0 8 -262144 -1 -1 0 1;",
         }) //
    {
        Ast::PdObject msg;

        auto f = input.begin(), l = input.end();
        std::cout << "Input: " << quoted(input) << std::endl;
        if (qi::parse(f, l, parser, msg)) {
            std::cout << " -> " << boost::core::demangle(msg.definition.type().name()) << std::endl;
            std::cout << " -> " << msg << std::endl;
        } else
            std::cout << " -> FAILED" << std::endl;

        if (f != l)
            std::cout << " Remaining: " << quoted(std::string(f, l)) << std::endl;
    }
}

Prints

Input: "#X obj 55 50;"
 -> Ast::Defs::Generic
 -> (55 50 ("" { }))
Input: "#X obj 92 146 bng 20 250 50 0 empty empty empty 0 -10 0 12 #fcfcfc #000000 #000000;"
 -> Ast::Defs::Generic
 -> (92 146 ("bng" { "20" "250" "50" "0" "empty" "empty" "empty" "0" "-10" "0" "12" "#fcfcfc" "#000000" "#000000" }))
Input: "#X obj 50 38 vsl 15 128 0 127 0 0 empty empty empty 0 -8 0 8 -262144 -1 -1 0 1;"
 -> Ast::Defs::Vslider
 -> (50 38 ( 15  128  0  127 0 "" "empty" "empty"  "empty" 0 -8  "0"  8  (-262144)  (-1)  (-1)  0  1))

Note how we parse bng as Generic by default, simply because we didn't add a definition rule for it yet. Adding it: Live:

Input: "#X obj 55 50;"
 -> Ast::Defs::Generic
 -> (55 50 ("" { }))
Input: "#X obj 92 146 bng 20 250 50 0 empty empty empty 0 -10 0 12 #fcfcfc #000000 #000000;"
 -> Ast::Defs::Bang
 -> (92 146 ( 20  250  2 "" "empty" "empty" "empty"  0  -10  "0"  12  (16579836)  (0)  (0)))
Input: "#X obj 50 38 vsl 15 128 0 127 0 0 empty empty empty 0 -8 0 8 -262144 -1 -1 0 1;"
 -> Ast::Defs::Vslider
 -> (50 38 ( 15  128  0  127 0 "" "empty" "empty"  "empty" 0 -8  "0"  8  (-262144)  (-1)  (-1)  0  1))

That was basically 1:1 copy-paste from the PureData grammar docs.

Of course, my fingers itch to remove the duplication of init, send, receive, label, x_off, y_off, font, fontsize, bg_color, fg_color and label_color... But I'll leave it as an exorcism for the reader.

Upvotes: 1
