grisha

Reputation: 1297

Learning Boost.Spirit: parsing INI

I have started learning Boost.Spirit and finished reading the "Qi - Writing Parsers" section. While reading, everything seemed easy and understandable. But when I try to write something myself, I get a lot of errors, because there are so many includes and namespaces and I don't yet know when to include or use each of them. As practice, I want to write a simple INI parser.

Here is the code (the includes, like almost everything else, are taken from one of the examples in the Spirit library):

#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/spirit/include/phoenix_stl.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/include/phoenix_object.hpp>

#include <iostream>
#include <string>
#include <vector>
#include <map>

namespace client
{
    typedef std::map<std::string, std::string> key_value_map_t;

    struct mini_ini
    {
        std::string name;
        key_value_map_t key_values_map;
    };
} // client

BOOST_FUSION_ADAPT_STRUCT(
    client::mini_ini,
    (std::string, name)
    (client::key_value_map_t, key_values_map)
)

namespace client
{
    namespace qi = boost::spirit::qi;
    namespace ascii = boost::spirit::ascii;
    namespace phoenix = boost::phoenix;

    template <typename Iterator>
    struct ini_grammar : qi::grammar<Iterator, mini_ini(), ascii::space_type>
    {
        ini_grammar() : ini_grammar::base_type(section_, "section")
        {
            using qi::char_;
            using qi::on_error;
            using qi::fail;
            using namespace qi::labels;
            using phoenix::construct;
            using phoenix::val;

            key_ = +char_("a-zA-Z_0-9");
            pair_ = key_ >> '=' >> *char_;
            section_ = '[' >> key_ >> ']' >> '\n' >> *(pair_ >> '\n');

            key_.name("key");
            pair_.name("pair");
            section_.name("section");

            on_error<fail>
            (
                section_
              , std::cout
                    << val("Error! Expecting ")
                    << _4                               // what failed?
                    << val(" here: \"")
                    << construct<std::string>(_3, _2)   // iterators to error-pos, end
                    << val("\"")
                    << std::endl
            );
        }

        qi::rule<Iterator, std::string(), ascii::space_type> key_;
        qi::rule<Iterator, mini_ini(), ascii::space_type> section_;
        qi::rule<Iterator, std::pair<std::string, std::string>(), ascii::space_type> pair_;
    };
} // client

int
main()
{
    std::string storage =
        "[section]\n"
        "key1=val1\n"
        "key2=val2\n";
    client::mini_ini ini;
    typedef client::ini_grammar<std::string::const_iterator> ini_grammar;
    ini_grammar grammar;

    using boost::spirit::ascii::space;
    std::string::const_iterator iter = storage.begin();
    std::string::const_iterator end = storage.end();
    bool r = phrase_parse(iter, end, grammar, space, ini);

    if (r && iter == end)
    {
        std::cout << "-------------------------\n";
        std::cout << "Parsing succeeded\n";
        std::cout << "-------------------------\n";

        return 0;
    }
    else
    {
        std::cout << "-------------------------\n";
        std::cout << "Parsing failed\n";
        std::cout << "-------------------------\n";
        std::cout << std::string(iter, end) << "\n";
        return 1;
    }

    return 0;
}

As you can see, I want to parse the following text into the mini_ini struct:

"[section]"
"key1=val1"
"key2=val2";

The parse fails, and std::string(iter, end) is the full input string.

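My questions:

Why do I see the failure but not the output of the on_error handler?
Do you have any recommendations on how to learn Boost.Spirit? (I understand the documentation well in theory, but in practice I have a lot of WHY???)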

Thanks

Upvotes: 1

Views: 644

Answers (1)

sehe

Reputation: 393154

Q. Why do I see the failure but not the on_error handler?

The on_error handler fires only for the rule it is registered on (section_), and only when an expectation point fails.

Your grammar contains no expectation points: it uses only the sequence operator >>, never the expectation operator >.

Q. Have you any recommendations how to learn Boost.Spirit (I have good understanding of documentation in theory, but in practice I have a lot of WHY ???) ?

Just build the parsers you need. Copy good conventions from the docs and SO answers. There are a lot of them. As you have seen, quite a number contain full examples of Ini parsers with varying levels of error reporting too.

Bonus hints:

Do more detailed status reporting:

bool ok = phrase_parse(iter, end, grammar, space, ini);

if (ok) {
    std::cout << "Parse success\n";
} else {
    std::cout << "Parse failure\n";
}

if (iter != end) {
    std::cout << "Remaining unparsed: '" << std::string(iter, end) << "'\n";
}

return ok && (iter==end)? 0 : 1;

Use BOOST_SPIRIT_DEBUG:

#define BOOST_SPIRIT_DEBUG

// and later
BOOST_SPIRIT_DEBUG_NODES((key_)(pair_)(section_))

Prints:

<section_>
  <try>[section]\nkey1=val1\n</try>
  <key_>
    <try>section]\nkey1=val1\nk</try>
    <success>]\nkey1=val1\nkey2=val</success>
    <attributes>[[s, e, c, t, i, o, n]]</attributes>
  </key_>
  <fail/>
</section_>
Parse failure
Remaining unparsed: '[section]
key1=val1
key2=val2
'

You'll notice that the section header isn't parsed because the newline is never matched. Your skipper (space_type) skips newlines, so the '\n' in the grammar never gets a chance to match. See also: Boost spirit skipper issues

Fix skipper

When using blank_type as the skipper you'll get a successful parse:

<section_>
<try>[section]\nkey1=val1\n</try>
<key_>
    <try>section]\nkey1=val1\nk</try>
    <success>]\nkey1=val1\nkey2=val</success>
    <attributes>[[s, e, c, t, i, o, n]]</attributes>
</key_>
<pair_>
    <try>key1=val1\nkey2=val2\n</try>
    <key_>
    <try>key1=val1\nkey2=val2\n</try>
    <success>=val1\nkey2=val2\n</success>
    <attributes>[[k, e, y, 1]]</attributes>
    </key_>
    <success></success>
    <attributes>[[[k, e, y, 1], [v, a, l, 1, 
, k, e, y, 2, =, v, a, l, 2, 
]]]</attributes>
</pair_>
<success>key1=val1\nkey2=val2\n</success>
<attributes>[[[s, e, c, t, i, o, n], []]]</attributes>
</section_>
Parse success
Remaining unparsed: 'key1=val1
key2=val2
'

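NOTE: The parse succeeds but doesn't do what you want: *char_ also consumes newlines, so the first value swallows the rest of the input and no pair is ever followed by '\n'. Exclude line ends from the value instead: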

       pair_ = key_ >> '=' >> *(char_ - qi::eol); // or
       pair_ = key_ >> '=' >> *~char_("\r\n"); // etc

Full code

Live On Coliru

#define BOOST_SPIRIT_DEBUG
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/spirit/include/phoenix_stl.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/spirit/include/phoenix_object.hpp>

#include <iostream>
#include <string>
#include <vector>
#include <map>

namespace client
{
    typedef std::map<std::string, std::string> key_value_map_t;

    struct mini_ini
    {
        std::string name;
        key_value_map_t key_values_map;
    };
} // client

BOOST_FUSION_ADAPT_STRUCT(
    client::mini_ini,
    (std::string, name)
    (client::key_value_map_t, key_values_map)
)

namespace client
{
    namespace qi      = boost::spirit::qi;
    namespace ascii   = boost::spirit::ascii;
    namespace phoenix = boost::phoenix;

    template <typename Iterator>
    struct ini_grammar : qi::grammar<Iterator, mini_ini(), ascii::blank_type>
    {
        ini_grammar() : ini_grammar::base_type(section_, "section")
        {
            using qi::char_;
            using qi::on_error;
            using qi::fail;
            using namespace qi::labels;
            using phoenix::construct;
            using phoenix::val;

            key_ = +char_("a-zA-Z_0-9");
            pair_ = key_ >> '=' >> *(char_ - qi::eol); // value stops at the end of the line
            section_ = '[' >> key_ >> ']' >> '\n' >> *(pair_ >> '\n');

            BOOST_SPIRIT_DEBUG_NODES((key_)(pair_)(section_))

            on_error<fail>
            (
                section_
              , std::cout
                    << val("Error! Expecting ")
                    << _4                               // what failed?
                    << val(" here: \"")
                    << construct<std::string>(_3, _2)   // iterators to error-pos, end
                    << val("\"")
                    << std::endl
            );
        }

        qi::rule<Iterator, std::string(), ascii::blank_type> key_;
        qi::rule<Iterator, mini_ini(), ascii::blank_type> section_;
        qi::rule<Iterator, std::pair<std::string, std::string>(), ascii::blank_type> pair_;
    };
} // client

int
main()
{
    std::string storage =
        "[section]\n"
        "key1=val1\n"
        "key2=val2\n";
    client::mini_ini ini;
    typedef client::ini_grammar<std::string::const_iterator> ini_grammar;
    ini_grammar grammar;

    using boost::spirit::ascii::blank;
    std::string::const_iterator iter = storage.begin();
    std::string::const_iterator end = storage.end();
    bool ok = phrase_parse(iter, end, grammar, blank, ini);

    if (ok) {
        std::cout << "Parse success\n";
    } else {
        std::cout << "Parse failure\n";
    }

    if (iter != end) {
        std::cout << "Remaining unparsed: '" << std::string(iter, end) << "'\n";
    }

    return ok && (iter==end)? 0 : 1;
}

Upvotes: 2
