Reputation: 43
I need to load these values from INI file and print them in the application using C++ Boost Library. The sections have duplicate names. I have been restricted to using C++ Boost Library only.
numColors = 4
boardSize = 11
numSnails = 2
[initialization]
id = 0
row = 3
col = 4
orientation = 0
[initialization]
id = 1
row = 5
col = 0
orientation = 1
[color]
id = 0
nextColor = 1
deltaOrientation = +2
[color]
id = 1
nextColor = 2
deltaOrientation = +1
[color]
id = 2
nextColor = 3
deltaOrientation = -2
[color]
id = 3
nextColor = 0
deltaOrientation = -1
Upvotes: 3
Views: 1556
Reputation: 393249
I figured that for a random passer-by the Boost Spirit answer might seem complex/overkill.
I wanted to try again, since it's 2021, whether with C++17, we can do a reasonable job with just the standard library.
Turns out it's a lot more work. The Qi implementation takes 86 lines of code, but the standard library implementation takes 136 lines. Also, it took me a lot longer (several hours) to debug/write. In particular it was hard to get '='
, '['
, ']'
as token boundaries with std::istream&
. I used the ctype
facet approach from this answer: How do I iterate over cin line by line in C++?
I did leave in the
DebugPeeker
(20 lines) so you can perhaps understand it yourself.
The top level parse function looks sane and shows what I wanted to achieve: natural std::istream
extraction:
static Ast::File std_parse_game(std::string_view input) {
std::istringstream iss{std::string(input)};
using namespace Helpers;
if (Ast::File parsed; iss >> parsed)
return parsed;
throw std::runtime_error("Unable to parse game");
}
All the rest lives in namespace Helpers
:
static inline std::istream& operator>>(std::istream& is, Ast::File& v) {
for (section s; is >> s;) {
if (s.name == "parameters")
is >> v.parameters;
else if (s.name == "initialization")
is >> v.initializations.emplace_back();
else if (s.name == "color")
is >> v.colors.emplace_back();
else
is.setstate(std::ios::failbit);
}
if (is.eof())
is.clear();
return is;
}
So far, this is paying out well. The different section types are similar:
static inline std::istream& operator>>(std::istream& is, Ast::Parameters& v) {
return is
>> entry{"numColors", v.numColors}
>> entry{"boardSize", v.boardSize}
>> entry{"numSnails", v.numSnails};
}
static inline std::istream& operator>>(std::istream& is, Ast::Initialization& v) {
return is
>> entry{"id", v.id}
>> entry{"row", v.row}
>> entry{"col", v.col}
>> entry{"orientation", v.orientation};
}
static inline std::istream& operator>>(std::istream& is, Ast::Color& v) {
return is
>> entry{"id", v.id}
>> entry{"nextColor", v.nextColor}
>> entry{"deltaOrientation", v.deltaOrientation};
}
Now, if all was plain sailing like this, I would not be recommending Spirit. Now we get into conditional parsings.
The entry{"name", value}
formulations use a "manipulator type":
template <typename T> struct entry {
entry(std::string name, T& into) : _name(name), _into(into) {}
std::string _name;
T& _into;
friend std::istream& operator>>(std::istream& is, entry e) {
return is >> expect{e._name} >> expect{'='} >> e._into;
}
};
Similarly, sections are using expect
and token
:
struct section {
std::string name;
friend std::istream& operator>>(std::istream& is, section& s) {
if (is >> expect('['))
return is >> token{s.name} >> expect{']'};
return is;
}
};
The conditional is important to be able to detect EOF without putting the stream into a hard failed mode (is.bad()
!= is.fail()
).
expect
is built on top of token
:
template <typename T> struct expect {
expect(T expected) : _expected(expected) {}
T _expected;
friend std::istream& operator>>(std::istream& is, expect const& e) {
if (T actual; is >> token{actual})
if (actual != e._expected)
is.setstate(std::ios::failbit);
return is;
}
};
You will notice that there's much less error information. We just make the
stream fail()
in case an expected token is not found.
Here's where the real complexity comes. I didn't want to parse character by
character. But reading std::string
using operator>>
would only stop on
whitespace, meaning that a section name would "eat" the bracket: parameters]
instead of parameters
, and keys might eat the =
character if there was no
separating space.
In the answer linked above we learn how to build our own character classification locale facet:
// make sure =,[,] break tokens
struct mytoken_ctype : std::ctype<char> {
static auto const* get_table() {
static std::vector rc(table_size, std::ctype_base::mask());
rc[' '] = rc['\f'] = rc['\v'] = rc['\t'] = rc['\r'] = rc['\n'] =
std::ctype_base::space;
// crucial for us:
rc['='] = rc['['] = rc[']'] = std::ctype_base::space;
return rc.data();
}
mytoken_ctype() : std::ctype<char>(get_table()) {}
};
And then we need to use it but only if we parse a std::string
token. That
way, if we expect('=')
it will not skip over '='
because our facet calls it whitespace...
template <typename T> struct token {
token(T& into) : _into(into) {}
T& _into;
friend std::istream& operator>>(std::istream& is, token const& t) {
std::locale loc = is.getloc();
if constexpr (std::is_same_v<std::decay_t<T>, std::string>) {
loc = is.imbue(std::locale(std::locale(), new mytoken_ctype()));
}
try { is >> t._into; is.imbue(loc); }
catch (...) { is.imbue(loc); throw; }
return is;
}
};
I tried to keep it pretty condensed. Had I used proper formatting, we would have had more lines of code still :)
I used the same Ast types, so it made sense to test both implementations and compare the results for equality.
NOTES:
- On Compiler Explorer so we can enjoy
libfmt
for easy output- For comparison I used one C++20 feature to get compiler generated
operator==
#include <boost/spirit/home/qi.hpp>
#include <boost/fusion/include/io.hpp>
#include <fstream>
#include <sstream>
#include <iomanip>
#include <fmt/ranges.h>
#include <fmt/ostream.h>
namespace qi = boost::spirit::qi;
namespace Ast {
using Id = unsigned;
using Size = uint16_t; // avoiding char types for easy debug/output
using Coord = Size;
using ColorNumber = Size;
using Orientation = Size;
using Delta = signed;
struct Parameters {
Size numColors{}, boardSize{}, numSnails{};
bool operator==(Parameters const&) const = default;
};
struct Initialization {
Id id;
Coord row;
Coord col;
Orientation orientation;
bool operator==(Initialization const&) const = default;
};
struct Color {
Id id;
ColorNumber nextColor;
Delta deltaOrientation;
bool operator==(Color const&) const = default;
};
struct File {
Parameters parameters;
std::vector<Initialization> initializations;
std::vector<Color> colors;
bool operator==(File const&) const = default;
};
using boost::fusion::operator<<;
template <typename T>
static inline std::ostream& operator<<(std::ostream& os, std::vector<T> const& v) {
return os << fmt::format("vector<{}>{}",
boost::core::demangle(typeid(T).name()), v);
}
} // namespace Ast
BOOST_FUSION_ADAPT_STRUCT(Ast::Parameters, numColors, boardSize, numSnails)
BOOST_FUSION_ADAPT_STRUCT(Ast::Initialization, id, row, col, orientation)
BOOST_FUSION_ADAPT_STRUCT(Ast::Color, id, nextColor, deltaOrientation)
BOOST_FUSION_ADAPT_STRUCT(Ast::File, parameters, initializations, colors)
template <typename It>
struct GameParser : qi::grammar<It, Ast::File()> {
GameParser() : GameParser::base_type(start) {
using namespace qi;
start = skip(blank)[file];
auto section = [](const std::string& name) {
return copy('[' >> lexeme[lit(name)] >> ']' >> (+eol | eoi));
};
auto required = [](const std::string& name, auto value) {
return copy(lexeme[eps > lit(name)] > '=' > value >
(+eol | eoi));
};
file = parameters >
*initialization >
*color >
eoi; // must reach end of input
parameters = section("parameters") >
required("numColors", _size) >
required("boardSize", _size) >
required("numSnails", _size);
initialization = section("initialization") >
required("id", _id) >
required("row", _coord) >
required("col", _coord) >
required("orientation", _orientation);
color = section("color") >
required("id", _id) >
required("nextColor", _colorNumber) >
required("deltaOrientation", _delta);
BOOST_SPIRIT_DEBUG_NODES((file)(parameters)(initialization)(color))
}
private:
using Skipper = qi::blank_type;
qi::rule<It, Ast::File()> start;
qi::rule<It, Ast::File(), Skipper> file;
// sections
qi::rule<It, Ast::Parameters(), Skipper> parameters;
qi::rule<It, Ast::Initialization(), Skipper> initialization;
qi::rule<It, Ast::Color(), Skipper> color;
// value types
qi::uint_parser<Ast::Id> _id;
qi::uint_parser<Ast::Size> _size;
qi::uint_parser<Ast::Coord> _coord;
qi::uint_parser<Ast::ColorNumber> _colorNumber;
qi::uint_parser<Ast::Orientation> _orientation;
qi::int_parser<Ast::Delta> _delta;
};
static Ast::File qi_parse_game(std::string_view input) {
using SVI = std::string_view::const_iterator;
static const GameParser<SVI> parser{};
try {
Ast::File parsed;
if (qi::parse(input.begin(), input.end(), parser, parsed)) {
return parsed;
}
throw std::runtime_error("Unable to parse game");
} catch (qi::expectation_failure<SVI> const& ef) {
std::ostringstream oss;
auto where = ef.first - input.begin();
auto sol = 1 + input.find_last_of("\r\n", where);
auto lineno = 1 + std::count(input.begin(), input.begin() + sol, '\n');
auto col = 1 + where - sol;
auto llen = input.substr(sol).find_first_of("\r\n");
oss << "input.txt:" << lineno << ":" << col << " Expected: " << ef.what_ << "\n"
<< " note: " << input.substr(sol, llen) << "\n"
<< " note:" << std::setw(col) << "" << "^--- here";
throw std::runtime_error(oss.str());
}
}
namespace Helpers {
struct DebugPeeker {
DebugPeeker(std::istream& is, int line) : is(is), line(line) { dopeek(); }
~DebugPeeker() { dopeek(); }
private:
std::istream& is;
int line;
void dopeek() const {
std::char_traits<char> t;
auto ch = is.peek();
std::cerr << "DEBUG " << line << " Peek: ";
if (std::isgraph(ch))
std::cerr << "'" << t.to_char_type(ch) << "'";
else
std::cerr << "<" << ch << ">";
std::cerr << " " << std::boolalpha << is.good() << "\n";
}
};
#define DEBUG_PEEK(is) // Peeker _peek##__LINE__(is, __LINE__);
// make sure =,[,] break tokens
struct mytoken_ctype : std::ctype<char> {
static auto const* get_table() {
static std::vector rc(table_size, std::ctype_base::mask());
rc[' '] = rc['\f'] = rc['\v'] = rc['\t'] = rc['\r'] = rc['\n'] =
std::ctype_base::space;
// crucial for us:
rc['='] = rc['['] = rc[']'] = std::ctype_base::space;
return rc.data();
}
mytoken_ctype() : std::ctype<char>(get_table()) {}
};
template <typename T> struct token {
token(T& into) : _into(into) {}
T& _into;
friend std::istream& operator>>(std::istream& is, token const& t) {
DEBUG_PEEK(is);
std::locale loc = is.getloc();
if constexpr (std::is_same_v<std::decay_t<T>, std::string>) {
loc = is.imbue(std::locale(std::locale(), new mytoken_ctype()));
}
try { is >> t._into; is.imbue(loc); }
catch (...) { is.imbue(loc); throw; }
return is;
}
};
template <typename T> struct expect {
expect(T expected) : _expected(expected) {}
T _expected;
friend std::istream& operator>>(std::istream& is, expect const& e) {
DEBUG_PEEK(is);
if (T actual; is >> token{actual})
if (actual != e._expected)
is.setstate(std::ios::failbit);
return is;
}
};
template <typename T> struct entry {
entry(std::string name, T& into) : _name(name), _into(into) {}
std::string _name;
T& _into;
friend std::istream& operator>>(std::istream& is, entry e) {
DEBUG_PEEK(is);
return is >> expect{e._name} >> expect{'='} >> e._into;
}
};
struct section {
std::string name;
friend std::istream& operator>>(std::istream& is, section& s) {
DEBUG_PEEK(is);
if (is >> expect('['))
return is >> token{s.name} >> expect{']'};
return is;
}
};
static inline std::istream& operator>>(std::istream& is, Ast::Parameters& v) {
DEBUG_PEEK(is);
return is
>> entry{"numColors", v.numColors}
>> entry{"boardSize", v.boardSize}
>> entry{"numSnails", v.numSnails};
}
static inline std::istream& operator>>(std::istream& is, Ast::Initialization& v) {
DEBUG_PEEK(is);
return is
>> entry{"id", v.id}
>> entry{"row", v.row}
>> entry{"col", v.col}
>> entry{"orientation", v.orientation};
}
static inline std::istream& operator>>(std::istream& is, Ast::Color& v) {
DEBUG_PEEK(is);
return is
>> entry{"id", v.id}
>> entry{"nextColor", v.nextColor}
>> entry{"deltaOrientation", v.deltaOrientation};
}
static inline std::istream& operator>>(std::istream& is, Ast::File& v) {
DEBUG_PEEK(is);
for (section s; is >> s;) {
if (s.name == "parameters")
is >> v.parameters;
else if (s.name == "initialization")
is >> v.initializations.emplace_back();
else if (s.name == "color")
is >> v.colors.emplace_back();
else
is.setstate(std::ios::failbit);
}
if (is.eof())
is.clear();
return is;
}
}
static Ast::File std_parse_game(std::string_view input) {
std::istringstream iss{std::string(input)};
using namespace Helpers;
if (Ast::File parsed; iss >> parsed)
return parsed;
throw std::runtime_error("Unable to parse game");
}
std::string read_file(const std::string& name) {
std::ifstream ifs(name);
return std::string(std::istreambuf_iterator<char>(ifs), {});
}
int main() {
std::string const game_save = read_file("input.txt");
Ast::File g1, g2;
try {
std::cout << "Qi: " << (g1 = qi_parse_game(game_save)) << "\n";
} catch (std::exception const& e) { std::cerr << e.what() << "\n"; }
try {
std::cout << "std: " << (g2 = std_parse_game(game_save)) << "\n";
} catch (std::exception const& e) { std::cerr << e.what() << "\n"; }
std::cout << "Equal: " << std::boolalpha << (g1 == g2) << "\n";
}
To my great relief, the parsers agree on the data:
Qi: ((4 11 2)
vector<Ast::Initialization>{(0 3 4 0), (1 5 0 1)}
vector<Ast::Color>{(0 1 2), (1 2 1), (2 3 -2), (3 0 -1)})
std: ((4 11 2)
vector<Ast::Initialization>{(0 3 4 0), (1 5 0 1)}
vector<Ast::Color>{(0 1 2), (1 2 1), (2 3 -2), (3 0 -1)})
Equal: true
Although this answer is "standard" and "portable" it has some drawbacks.
For example it is certainly not easier to get right, it has virtually no debug options or error reporting, it doesn't validate the input format as much. E.g. it will still read this unholy mess and accept it:
[parameters] numColors=999 boardSize=999 numSnails=999
[color] id=0 nextColor=1 deltaOrientation=+2 [color] id=1 nextColor=2
deltaOrientation=+1 [
initialization] id=1 row=5 col=0 orientation=1
[color] id=2 nextColor=3 deltaOrientation=-2
[parameters] numColors=4 boardSize=11 numSnails=2
[color] id=3 nextColor=0 deltaOrientation=-1
[initialization] id=0 row=3 col=4 orientation=0
If your input format is not stable and computer-written, I would highly recommend against the standard-library approach because it will lead to hard-to-diagnose problems and just horrible UX (don't make your users want to throw their computer out of the window because of unhelpful error messages like "Unable to parse game data").
Otherwise, you might. For one thing, it'll be faster to compile.
Upvotes: 3
Reputation: 393249
In short, this is not INI format at all. It just very loosely resembles it. Which is nice.
You don't specify a lot, so I'm going to make assumptions.
I'm going to, for simplicity, assume that
Non-essential deductions (could be used to add more validation):
To represent the file, I'd make:
namespace Ast {
struct Initialization {
unsigned id, row, col, orientation;
};
struct Color {
unsigned id, nextColor;
int deltaOrientation;
};
struct File {
unsigned numColors, boardSize, numSnails;
std::vector<Initialization> initializations;
std::vector<Color> colors;
};
}
That's the simplest I can think of.
Is a nice job for Boost Spirit. If we adapt the data structures as Fusion Sequences:
BOOST_FUSION_ADAPT_STRUCT(Ast::Initialization, id, row, col, orientation)
BOOST_FUSION_ADAPT_STRUCT(Ast::Color, id, nextColor, deltaOrientation)
BOOST_FUSION_ADAPT_STRUCT(Ast::File, numColors, boardSize, numSnails,
initializations, colors)
We can basically let the parser "write itself":
template <typename It>
struct GameParser : qi::grammar<It, Ast::File()> {
GameParser() : GameParser::base_type(start) {
using namespace qi;
start = skip(blank)[file];
auto section = [](std::string name) {
return copy('[' >> lexeme[lit(name)] >> ']' >> (+eol | eoi));
};
auto required = [](std::string name) {
return copy(lexeme[eps > lit(name)] > '=' > auto_ >
(+eol | eoi));
};
file =
required("numColors") >
required("boardSize") >
required("numSnails") >
*initialization >
*color >
eoi; // must reach end of input
initialization = section("initialization") >
required("id") >
required("row") >
required("col") >
required("orientation");
color = section("color") >
required("id") >
required("nextColor") >
required("deltaOrientation");
BOOST_SPIRIT_DEBUG_NODES((file)(initialization)(color))
}
private:
using Skipper = qi::blank_type;
qi::rule<It, Ast::File()> start;
qi::rule<It, Ast::File(), Skipper> file;
qi::rule<It, Ast::Initialization(), Skipper> initialization;
qi::rule<It, Ast::Color(), Skipper> color;
};
Because of the many assumptions we've made we littered the place with expectation points (operator>
sequences, instead of operator>>
). This means we get "helpful" error messages on invalid input, like
Expected: nextColor
Expected: =
Expected: <eoi>
See also BONUS section below that improves this a lot
Testing it, we will read the file first and then parse it using that parser:
std::string read_file(std::string name) {
std::ifstream ifs(name);
return std::string(std::istreambuf_iterator<char>(ifs), {});
}
static Ast::File parse_game(std::string_view input) {
using SVI = std::string_view::const_iterator;
static const GameParser<SVI> parser{};
try {
Ast::File parsed;
if (qi::parse(input.begin(), input.end(), parser, parsed)) {
return parsed;
}
throw std::runtime_error("Unable to parse game");
} catch (qi::expectation_failure<SVI> const& ef) {
std::ostringstream oss;
oss << "Expected: " << ef.what_;
throw std::runtime_error(oss.str());
}
}
A lot could be improved, but for now it works and parses your input:
int main() {
std::string game_save = read_file("input.txt");
Ast::File data = parse_game(game_save);
}
The absense of output means success.
Some improvements, instead of using auto_
to generate the right parser for the type, we can make that explicit:
namespace Ast {
using Id = unsigned;
using Size = uint8_t;
using Coord = Size;
using ColorNumber = Size;
using Orientation = Size;
using Delta = signed;
struct Initialization {
Id id;
Coord row;
Coord col;
Orientation orientation;
};
struct Color {
Id id;
ColorNumber nextColor;
Delta deltaOrientation;
};
struct File {
Size numColors{}, boardSize{}, numSnails{};
std::vector<Initialization> initializations;
std::vector<Color> colors;
};
} // namespace Ast
And then in the parser define the analogous:
qi::uint_parser<Ast::Id> _id;
qi::uint_parser<Ast::Size> _size;
qi::uint_parser<Ast::Coord> _coord;
qi::uint_parser<Ast::ColorNumber> _colorNumber;
qi::uint_parser<Ast::Orientation> _orientation;
qi::int_parser<Ast::Delta> _delta;
Which we then use e.g.:
initialization = section("initialization") >
required("id", _id) >
required("row", _coord) >
required("col", _coord) >
required("orientation", _orientation);
Now we can improve the error messages to be e.g.:
input.txt:2:13 Expected: <unsigned-integer>
note: boardSize = (11)
note: ^--- here
Or
input.txt:16:19 Expected: <alternative><eol><eoi>
note: nextColor = 1 deltaOrientation = +2
note: ^--- here
Full Code, Live On Coliru
//#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/home/qi.hpp>
#include <fstream>
#include <sstream>
#include <iomanip>
namespace qi = boost::spirit::qi;
namespace Ast {
using Id = unsigned;
using Size = uint8_t;
using Coord = Size;
using ColorNumber = Size;
using Orientation = Size;
using Delta = signed;
struct Initialization {
Id id;
Coord row;
Coord col;
Orientation orientation;
};
struct Color {
Id id;
ColorNumber nextColor;
Delta deltaOrientation;
};
struct File {
Size numColors{}, boardSize{}, numSnails{};
std::vector<Initialization> initializations;
std::vector<Color> colors;
};
} // namespace Ast
BOOST_FUSION_ADAPT_STRUCT(Ast::Initialization, id, row, col, orientation)
BOOST_FUSION_ADAPT_STRUCT(Ast::Color, id, nextColor, deltaOrientation)
BOOST_FUSION_ADAPT_STRUCT(Ast::File, numColors, boardSize, numSnails,
initializations, colors)
template <typename It>
struct GameParser : qi::grammar<It, Ast::File()> {
GameParser() : GameParser::base_type(start) {
using namespace qi;
start = skip(blank)[file];
auto section = [](const std::string& name) {
return copy('[' >> lexeme[lit(name)] >> ']' >> (+eol | eoi));
};
auto required = [](const std::string& name, auto value) {
return copy(lexeme[eps > lit(name)] > '=' > value >
(+eol | eoi));
};
file =
required("numColors", _size) >
required("boardSize", _size) >
required("numSnails", _size) >
*initialization >
*color >
eoi; // must reach end of input
initialization = section("initialization") >
required("id", _id) >
required("row", _coord) >
required("col", _coord) >
required("orientation", _orientation);
color = section("color") >
required("id", _id) >
required("nextColor", _colorNumber) >
required("deltaOrientation", _delta);
BOOST_SPIRIT_DEBUG_NODES((file)(initialization)(color))
}
private:
using Skipper = qi::blank_type;
qi::rule<It, Ast::File()> start;
qi::rule<It, Ast::File(), Skipper> file;
qi::rule<It, Ast::Initialization(), Skipper> initialization;
qi::rule<It, Ast::Color(), Skipper> color;
qi::uint_parser<Ast::Id> _id;
qi::uint_parser<Ast::Size> _size;
qi::uint_parser<Ast::Coord> _coord;
qi::uint_parser<Ast::ColorNumber> _colorNumber;
qi::uint_parser<Ast::Orientation> _orientation;
qi::int_parser<Ast::Delta> _delta;
};
std::string read_file(const std::string& name) {
std::ifstream ifs(name);
return std::string(std::istreambuf_iterator<char>(ifs), {});
}
static Ast::File parse_game(std::string_view input) {
using SVI = std::string_view::const_iterator;
static const GameParser<SVI> parser{};
try {
Ast::File parsed;
if (qi::parse(input.begin(), input.end(), parser, parsed)) {
return parsed;
}
throw std::runtime_error("Unable to parse game");
} catch (qi::expectation_failure<SVI> const& ef) {
std::ostringstream oss;
auto where = ef.first - input.begin();
auto sol = 1 + input.find_last_of("\r\n", where);
auto lineno = 1 + std::count(input.begin(), input.begin() + sol, '\n');
auto col = 1 + where - sol;
auto llen = input.substr(sol).find_first_of("\r\n");
oss << "input.txt:" << lineno << ":" << col << " Expected: " << ef.what_ << "\n"
<< " note: " << input.substr(sol, llen) << "\n"
<< " note:" << std::setw(col) << "" << "^--- here";
throw std::runtime_error(oss.str());
}
}
int main() {
std::string game_save = read_file("input.txt");
try {
Ast::File data = parse_game(game_save);
} catch (std::exception const& e) {
std::cerr << e.what() << "\n";
}
}
Look here for various failure modes and BOOST_SPIRIT_DEBUG ouput:
Upvotes: 1