bartop

Reputation: 10315

Boost spirit x3 - lazy parser

Does the latest boost::spirit::x3 implement a lazy parser? I found it in the documentation but cannot find it in the source code on GitHub, and I can't use boost::spirit::x3::lazy. Am I missing something, or were lazy parsers removed from Spirit, renamed, or something else?

Upvotes: 4

Views: 612

Answers (1)

sehe

Reputation: 392833

I thought I would try my hand here.

What is needed is some type-erasure around the iterator and attribute types. This is getting very close to the interface of a qi::rule in the old days.

To be complete we could also erase or transform contexts (e.g. to propagate the skipper inside the lazy rule), but I chose simplicity here.

In many cases the parsers to be lazily invoked might be lexemes anyway (as in the sample I will use).

In our use-case, let's parse these inputs:

integer_value: 42
quoted_string: "hello world"
bool_value: true
double_value: 3.1415926

We'll use a variant attribute type, and start by defining a Rule type that erases the concrete parser types:

using Value = boost::variant<int, bool, double, std::string>;
using It    = std::string::const_iterator;
using Rule  = x3::any_parser<It, Value>;
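
To illustrate what "erasing" the parser type means, here is a minimal standard-library sketch that does not use Spirit at all. It assumes std::variant and std::function as stand-ins for boost::variant and x3::any_parser; the names ErasedParser, parse_uint and parse_bool are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cctype>
#include <functional>
#include <string>
#include <variant>

// Hypothetical stand-ins for the answer's types; none of this is X3 API.
using Value = std::variant<int, bool, double, std::string>;
using It    = std::string::const_iterator;

// Type-erased parser: any callable with this signature can be stored,
// whatever its concrete type. It advances `first` only on success.
using ErasedParser = std::function<bool(It& first, It last, Value& out)>;

// Parses one or more decimal digits into an int (no sign, no overflow check).
inline bool parse_uint(It& first, It last, Value& out) {
    It cur = first;
    int n = 0;
    while (cur != last && std::isdigit(static_cast<unsigned char>(*cur))) {
        n = n * 10 + (*cur - '0');
        ++cur;
    }
    if (cur == first) return false; // no digits: leave `first` untouched
    first = cur;
    out = n;
    return true;
}

// Parses the literal "true" or "false" into a bool.
inline bool parse_bool(It& first, It last, Value& out) {
    const std::string rest(first, last);
    if (rest.compare(0, 4, "true") == 0)  { first += 4; out = true;  return true; }
    if (rest.compare(0, 5, "false") == 0) { first += 5; out = false; return true; }
    return false;
}
```

Both functions can be assigned to the same ErasedParser variable, which is the property the Rule alias above provides for X3 parse expressions.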

Passing The Lazy Subject Around

Now, where do we "get" the lazy subject from?

In Spirit Qi, we had the Nabialek Trick. This would use qi::locals<> or inherited attributes, which basically both boiled down to using Phoenix lazy actors (qi::_r1 or qi::_a etc) to evaluate to a value from parser context at runtime.

In X3 there is no Phoenix, and we will have to manipulate the context using semantic actions ourselves.

The basic building block for this is the x3::with<T>[] directive¹. Here's what we'll end up using as the parser:

x3::symbols<Rule> options;

Now we can add any parse expression to the options, by saying e.g. options.add("anything", x3::eps);.

auto const parser = x3::with<Rule>(Rule{}) [
    set_context<Rule>[options] >> ':' >> lazy<Rule>
];

This adds a Rule value to the context, which can be set (set_context) and "executed" (lazy).

Like I said, we have to manipulate the context manually, so let's define some helpers that do this:

template <typename Tag>
struct set_context_type {
    template <typename P>
    auto operator[](P p) const {
        auto action = [](auto& ctx) {
            x3::get<Tag>(ctx) = x3::_attr(ctx);
        };
        return x3::omit [ p [ action ] ];
    }
};

template <typename Tag>
struct lazy_type : x3::parser<lazy_type<Tag>> {
    using attribute_type = typename Tag::attribute_type; // TODO FIXME?

    template<typename It, typename Ctx, typename RCtx, typename Attr>
    bool parse(It& first, It last, Ctx& ctx, RCtx& rctx, Attr& attr) const {
        auto& subject = x3::get<Tag>(ctx);

        It saved = first;
        x3::skip_over(first, last, ctx);
        // ctx and rctx are plain lvalue references (not forwarding references),
        // so pass them through as-is instead of using std::forward
        if (x3::as_parser(subject).parse(first, last, ctx, rctx, attr)) {
            return true;
        } else {
            first = saved;
            return false;
        }
    }
};

template <typename T> static const set_context_type<T> set_context{};
template <typename T> static const lazy_type<T> lazy{};

That's really all there is to it.
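
The two-step flow (pick a parser by keyword, then run it later) can also be sketched without Spirit. The following is only an illustration under stated assumptions: std::map stands in for x3::symbols, std::function for the type-erased Rule, and a local reference for the slot that x3::with puts in the context. The names Options, parse_line and parse_int are hypothetical, and whitespace before the colon is not handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <functional>
#include <map>
#include <string>
#include <variant>

using Value  = std::variant<int, bool, double, std::string>;
using It     = std::string::const_iterator;
using Parser = std::function<bool(It&, It, Value&)>;

// Plays the role of x3::symbols<Rule>: keyword -> type-erased parser.
using Options = std::map<std::string, Parser>;

// Parses one or more decimal digits into an int (no sign, no overflow check).
inline bool parse_int(It& first, It last, Value& out) {
    It cur = first;
    int n = 0;
    while (cur != last && std::isdigit(static_cast<unsigned char>(*cur))) {
        n = n * 10 + (*cur - '0');
        ++cur;
    }
    if (cur == first) return false;
    first = cur;
    out = n;
    return true;
}

// Parses `keyword ':' value`, choosing the value parser at run time.
inline bool parse_line(const std::string& input, const Options& options, Value& out) {
    It first = input.begin();
    It last  = input.end();

    // Step 1 (the "set" half): read the keyword and remember its parser.
    It colon = std::find(first, last, ':');
    if (colon == last) return false;
    auto found = options.find(std::string(first, colon));
    if (found == options.end()) return false; // unknown keyword
    const Parser& selected = found->second;

    // Step 2 (the "lazy" half): skip whitespace, then run whatever was selected.
    first = colon + 1;
    while (first != last && std::isspace(static_cast<unsigned char>(*first))) ++first;
    return selected(first, last, out);
}
```

With an empty Options table, parse_line("integer_value: 42", options, v) fails; after options["integer_value"] = parse_int; the same call succeeds and stores an int in v. That mirrors how extending the symbol table at run time changes what the X3 parser accepts.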

Demo Time

In this demo, the function run_tests() feeds the inputs above to the parser shown earlier:

auto run_tests = [=] {
    for (std::string const& input : {
            "integer_value: 42",
            "quoted_string: \"hello world\"",
            "bool_value: true",
            "double_value: 3.1415926",
        })
    {
        Value attr;
        std::cout << std::setw(36) << std::quoted(input);
        if (phrase_parse(begin(input), end(input), parser, x3::space, attr)) {
            std::cout << " -> success (" << attr << ")\n";
        } else {
            std::cout << " -> failed\n";
        }
    }
};

First we will run:

options.add("integer_value", x3::int_);
options.add("quoted_string", as<std::string> [
        // lexeme is actually redundant because we don't use surrounding skipper yet
        x3::lexeme [ '"' >> *('\\' >> x3::char_ | ~x3::char_('"')) >> '"' ]
    ]);
run_tests();

Which will print:

"integer_value: 42"                  -> success (42)
"quoted_string: \"hello world\""     -> success (hello world)
"bool_value: true"                   -> failed
"double_value: 3.1415926"            -> failed

Now, we can demonstrate the dynamic nature of that parser, by extending options:

options.add("double_value", x3::double_);
options.add("bool_value", x3::bool_);

run_tests();

And the output becomes:

"integer_value: 42"                  -> success (42)
"quoted_string: \"hello world\""     -> success (hello world)
"bool_value: true"                   -> success (true)
"double_value: 3.1415926"            -> success (3.14159)

Note that I threw in another helper, as<>, which makes it easier to coerce the attribute type to std::string there. It's an evolution of ideas from earlier answers.

Full Listing

#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <iomanip>

namespace x3 = boost::spirit::x3;

namespace {
    template <typename T>
    struct as_type {
        template <typename...> struct Tag{};

        template <typename P>
        auto operator[](P p) const {
            return x3::rule<Tag<T, P>, T> {"as"} = x3::as_parser(p);
        }
    };

    template <typename Tag>
    struct set_lazy_type {
        template <typename P>
        auto operator[](P p) const {
            auto action = [](auto& ctx) {
                x3::get<Tag>(ctx) = x3::_attr(ctx);
            };
            return x3::omit [ p [ action ] ];
        }
    };

    template <typename Tag>
    struct do_lazy_type : x3::parser<do_lazy_type<Tag>> {
        using attribute_type = typename Tag::attribute_type; // TODO FIXME?

        template <typename It, typename Ctx, typename RCtx, typename Attr>
        bool parse(It& first, It last, Ctx& ctx, RCtx& rctx, Attr& attr) const {
            auto& subject = x3::get<Tag>(ctx);

            It saved = first;
            x3::skip_over(first, last, ctx);
            // ctx and rctx are plain lvalue references (not forwarding references),
            // so pass them through as-is instead of using std::forward
            if (x3::as_parser(subject).parse(first, last, ctx, rctx, attr)) {
                return true;
            } else {
                first = saved;
                return false;
            }
        }
    };

    template <typename T> static const as_type<T>       as{};
    template <typename T> static const set_lazy_type<T> set_lazy{};
    template <typename T> static const do_lazy_type<T>  do_lazy{};
}

int main() {
    std::cout << std::boolalpha << std::left;

    using Value = boost::variant<int, bool, double, std::string>;
    using It    = std::string::const_iterator;
    using Rule  = x3::any_parser<It, Value>;

    x3::symbols<Rule> options;

    auto const parser = x3::with<Rule>(Rule{}) [
        set_lazy<Rule>[options] >> ':' >> do_lazy<Rule>
    ];

    auto run_tests = [=] {
        for (std::string const input : {
                "integer_value: 42",
                "quoted_string: \"hello world\"",
                "bool_value: true",
                "double_value: 3.1415926",
            })
        {
            Value attr;
            std::cout << std::setw(36) << std::quoted(input);
            if (phrase_parse(begin(input), end(input), parser, x3::space, attr)) {
                std::cout << " -> success (" << attr << ")\n";
            } else {
                std::cout << " -> failed\n";
            }
        }
    };

    std::cout << "Supporting only integer_value and quoted_string:\n";
    options.add("integer_value", x3::int_);
    options.add("quoted_string", as<std::string> [
            // lexeme is actually redundant because we don't use surrounding skipper yet
            x3::lexeme [ '"' >> *('\\' >> x3::char_ | ~x3::char_('"')) >> '"' ]
        ]);
    run_tests();

    std::cout << "\nAdded support for double_value and bool_value:\n";
    options.add("double_value", x3::double_);
    options.add("bool_value", x3::bool_);

    run_tests();
}

This prints the full output:

Supporting only integer_value and quoted_string:
"integer_value: 42"                  -> success (42)
"quoted_string: \"hello world\""     -> success (hello world)
"bool_value: true"                   -> failed
"double_value: 3.1415926"            -> failed

Added support for double_value and bool_value:
"integer_value: 42"                  -> success (42)
"quoted_string: \"hello world\""     -> success (hello world)
"bool_value: true"                   -> success (true)
"double_value: 3.1415926"            -> success (3.14159)

¹ Sadly, the documentation for it is missing in action.

Upvotes: 4
