Exagon
Exagon

Reputation: 5098

Strange semantic behaviour of boost spirit x3 after splitting

I came across a strange behaviour of boost spirit x3, after I splittet my grammar up into the recommended parser.hpp, parser_def.hpp, parser.cpp files. My example gramar parses some kind of easy enums:

enum = "enum" > identifier > "{" > identifier % "," > "}

this is my enum grammar. When I don't split the enum and identifier parser into the recommended files, everything works fine, especially the string "enum {foo, bar}" throws an expectation failure, as expected. This example can be found here: unsplitted working example

But when I split the exactly same grammar up into the different files, the parser throws

terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_M_construct null not valid

trying to parse the same string "enum {foo, bar}"

this example can be found here: splitted strange example

  1. ast.hpp

    #pragma once
    
    #include <vector>
    #include <string>
    #include <boost/fusion/include/adapt_struct.hpp>
    
    
    
    namespace ast{
    
    namespace x3 = boost::spirit::x3;
    
    struct Enum {
        std::string _name;
        std::vector<std::string> _elements;
    };
    
    
    }
    
    BOOST_FUSION_ADAPT_STRUCT(ast::Enum, _name, _elements)
    
  2. config.hpp

    #pragma once 
    
    #include <boost/spirit/home/x3.hpp>
    
    namespace parser{
    
        namespace x3 = boost::spirit::x3;
    
        typedef std::string::const_iterator iterator_type;
        typedef x3::phrase_parse_context<x3::ascii::space_type>::type context_type;
    
    }
    
  3. enum.cpp

    #include "enum_def.hpp"
    #include "config.hpp"
    
    namespace parser { namespace impl {
         BOOST_SPIRIT_INSTANTIATE(enum_type, iterator_type, context_type)
    }}
    
    namespace parser {
    
    const impl::enum_type& enum_parser()
    {
        return impl::enum_parser;
    }
    
    }
    
  4. enum_def.hpp

    #pragma once
    
    #include "identifier.hpp"
    #include "enum.hpp"
    #include "ast.hpp"
    
    namespace parser{ namespace impl{
    
        namespace x3=boost::spirit::x3;
    
        const enum_type enum_parser = "enum";
    
        namespace{
            const auto& identifier = parser::identifier();
        }
        auto const enum_parser_def =
            "enum"
            > identifier
            > "{"
            > identifier % ","
            >"}";
    
        BOOST_SPIRIT_DEFINE(enum_parser)
    }}
    
  5. enum.hpp

    #pragma once
    
    #include <boost/spirit/home/x3.hpp>
    #include "ast.hpp"
    
    namespace parser{ namespace impl{
        namespace x3=boost::spirit::x3;
    
        typedef x3::rule<class enum_class, ast::Enum> enum_type;
    
        BOOST_SPIRIT_DECLARE(enum_type)
    
    }}
    
    namespace parser{
        const impl::enum_type& enum_parser();
    }
    
  6. identifier.cpp

    #include "identifier_def.hpp"
    #include "config.hpp"
    
    namespace parser { namespace impl {
         BOOST_SPIRIT_INSTANTIATE(identifier_type, iterator_type, context_type)
    }}
    
    namespace parser {
    
    const impl::identifier_type& identifier()
    {
        return impl::identifier;
    }
    
    }
    
  7. identifier_def.hpp

    #pragma once
    #include <boost/spirit/home/x3.hpp>
    #include "identifier.hpp"
    
    namespace parser{ namespace impl{
    
        namespace x3=boost::spirit::x3;
    
        const identifier_type identifier = "identifier";    
    
        auto const identifier_def = x3::lexeme[
            ((x3::alpha | '_') >> *(x3::alnum | '_'))
        ];
    
        BOOST_SPIRIT_DEFINE(identifier)
    }}
    
  8. identifier.hpp

    #pragma once
    #include <boost/spirit/home/x3.hpp>
    
    namespace parser{ namespace impl{
        namespace x3=boost::spirit::x3;
    
        typedef x3::rule<class identifier_class, std::string> identifier_type;
    
        BOOST_SPIRIT_DECLARE(identifier_type)
    }}
    
    
    namespace parser{
        const impl::identifier_type& identifier();
    }
    
  9. main.cpp

    #include <boost/spirit/home/x3.hpp>
    #include "ast.hpp"
    #include "enum.hpp"
    
    namespace x3 = boost::spirit::x3;
    
    template<typename Parser, typename Attribute>
    bool test(const std::string& str, Parser&& p, Attribute&& attr)
    {
        using iterator_type = std::string::const_iterator;
        iterator_type in = str.begin();
        iterator_type end = str.end();
    
        bool ret = x3::phrase_parse(in, end, p, x3::ascii::space, attr);
        ret &= (in == end);
        return ret;
    
    }
    
    int main(){
        ast::Enum attr;
        test("enum foo{foo,bar}", parser::enum_parser(), attr);
        test("enum {foo,bar}", parser::enum_parser(), attr);    
    }
    

Is this a bug, am I missing something, or is this an expected behaviour?

EDIT: here is my repo with an example which throws an std::logic_error instead of an expectation_failure

Upvotes: 3

Views: 361

Answers (2)

Matt
Matt

Reputation: 20796

Here's a solution that worked for me, when the above workround does not.

Say you have files a.cpp, a.h, a_def.hpp, b.cpp, b.h, b_def.hpp, ... as recommended by the Boost.Spirit X3 docs.

The basic idea is to combine the *.cpp and *_def.hpp files into 1 file for each group. The *.h files can and should remain.

  1. ls *_def.hpp > parser_def.hpp (assuming parser_def.hpp doesn't already exist) and edit parser_def.hpp to #include the files in the correct order. Remove redundant lines (add a header guard, etc.) The goal is for parser_def.hpp to include the other files in the correct order.
  2. cat *.cpp > parser.cpp and edit parser.cpp to be syntactically correct, and replace all #include <*_def.hpp> lines with a single #include <parser_def.hpp> at the top. In your build file (e.g. make or cmake), replace the compilation of the *.cpp files with the single parser.cpp.
  3. You can get rid of the old *.cpp files.

You lose the convenience of separately compiled files, but it will avoid the Static Initialization Order Fiasco.

Upvotes: 2

sehe
sehe

Reputation: 393829

I've found the cause of the bug.

The bug is with the fact that the expect directive takes it subject parser by value, which is before the parser::impl::identifier initializer runs.

To visualize, imagine the static initializer for parser::impl::enum_parser running before parser::impl::identifier. This is valid for a compiler to do.

The copy, therefore, has an uninitialized name field, which fails as soon as the expectation point tries to construct the x3::expectation_failure with the which_ member, because constructing a std::string from a nullptr is illegal.

All in all, I fear the root cause here is Static Initialization Order Fiasco. I'll see whether I can fix it and submit a PR.

WORKAROUND:

An immediate workaround is to list the order of the source files in reverse, so that use comes after definition:

set(SOURCE_FILES 
    identifier.cpp
    enum.cpp 
    main.cpp 
)

Note that if this fixes it on your compiler (it does on mine) that is implementation defined. The standard does NOT specify the order of static initialization across compilation units.

Upvotes: 4

Related Questions