Reputation: 700
I'm trying to serialize an object of the following type:
std::unordered_map<std::vector<Card>, std::unordered_map<InfosetHistory, Node>>& nodeMap
Card is a struct; InfosetHistory and Node are classes which use some other structs as member variables. I have created serialize functions for all classes that need it. For example, here's the one for Card:
struct Card {
int rank;
int suit;
...
template<class Archive>
void serialize(Archive& ar, const unsigned int version) {
ar & rank;
ar & suit;
}
};
I serialize nodeMap as follows:
std::ofstream file("Strategy" + std::to_string(iteration) + ".bin");
boost::archive::binary_oarchive archive(file);
archive << nodeMap;
file.close();
and deserialize separately like this (Currently choosing to deserialize "Strategy0.bin"):
std::ifstream ifs("Strategy0.bin");
std::unordered_map<std::vector<Card>, std::unordered_map<InfosetHistory, Node>> nodeMap;
if (ifs.good()) {
boost::archive::binary_iarchive ia(ifs);
ia >> nodeMap;
}
When I run the program to create and serialize nodeMap, I am always able to serialize with no issues. The respective .bin files are created, and their sizes seem appropriate for the data I expect them to store.
When I run the program to deserialize nodeMap, however, if the nodeMap isn't that large, I don't have issues, but if it is large, I will get the following error:
terminate called after throwing an instance of 'boost::archive::archive_exception'
what(): input stream error
I assume that this is not actually because of the nodeMap being large, and more because there's a probability of the code creating an entry that is somehow causing problems, and the more entries that are added the greater the probability of running into problems. I read in the Boost documentation that this kind of error can be created because of uninitialized data. I don't believe I have uninitialized data, but I'm not sure how to make sure of that.
In general, I'm unsure how to go about debugging this sort of problem. Any help would be appreciated.
Note: I tried very hard to create a minimal reproducible example, but all the examples I created didn't produce the issue. It's only when I create this sort of object in my program, and add thousands of entries that I run into this kind of problem.
EDIT @sehe asked for some more code. Here are all the relevant sections of the classes and structs relevant to the serialized object:
Note that these classes and structs are in separate files. InfosetHistory and DrawAction are declared in InfosetHistory.h, Node is declared in Node.h, and Card is declared in GRUtil.h.
@sehe also mentioned that I don't mention the type serialized above. The type being serialized is a reference to the object I'm trying to serialize: std::unordered_map<std::vector<Card>, std::unordered_map<InfosetHistory, Node>>& nodeMap.
EDIT 2 I have managed to create a minimal reproducible example using the code @sehe provided below. Using my code, I created a NodeMap I knew would produce the deserialization error, and I printed all the data to a text file called "StrategyWritten0.txt". In this reproducible example, I input all the data from that text file to create a NodeMap, serialize all the data in the resulting NodeMap, and then attempt to deserialize the NodeMap. I get the following output:
Successfully created Node Map
Serialized Node Map
terminate called after throwing an instance of 'boost::archive::archive_exception'
what(): input stream error
Here is the file:
https://drive.google.com/file/d/1y4FLgi7f-XWJ-igRq_tItK-pXFDEUbhg/view?usp=sharing
And here is the code:
Upvotes: 2
Views: 1411
Reputation: 393009
After struggling with this for days (installing MingW on a VM and debugging into the nitty-gritty details) I narrowed it down to some special condition happening after the 25th element of an associative container. It was shortly after that when I had the brainwave:
Yes. Windows line endings ruined a good few days for a couple of people. Again.
I always write std::ios::binary flags on my archive files, but in this particular case I hadn't spotted they were missing.
Adding them will fix things:
void writeBinaryStrategy(std::string const& fname, NodeMap const& nodeMap) {
std::ofstream f(fname, std::ios::binary);
boost::archive::binary_oarchive archive(f);
archive << nodeMap;
}
NodeMap readBinaryStrategy(std::string const& fname) {
NodeMap nodeMap;
std::ifstream ifs(fname, std::ios::binary);
boost::archive::binary_iarchive ia(ifs);
ia >> nodeMap;
return nodeMap;
}
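To illustrate why the missing flag matters: on Windows, a stream opened in text mode rewrites every 0x0A byte as 0x0D 0x0A on output, so the bytes on disk are no longer the bytes the binary archive produced. As far as I know, the Microsoft runtime also treats 0x1A as end-of-file when reading in text mode. The sketch below only simulates the output translation so that it behaves the same on any platform. The helpers simulateWindowsTextWrite and rawBytes are hypothetical names made up for this illustration. They are not part of Boost or of the runtime.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: mimics what a Windows text-mode stream does on output,
// expanding each LF (0x0A) into CR LF (0x0D 0x0A).
std::string simulateWindowsTextWrite(std::string const& raw) {
    std::string out;
    out.reserve(raw.size());
    for (char c : raw) {
        if (c == '\n')
            out.push_back('\r');
        out.push_back(c);
    }
    return out;
}

// Hypothetical helper: the raw object representation of a 32-bit integer,
// which is roughly how a native binary format stores it.
std::string rawBytes(std::int32_t value) {
    std::string bytes(sizeof value, '\0');
    std::memcpy(&bytes[0], &value, sizeof value);
    return bytes;
}
```

With these, simulateWindowsTextWrite(rawBytes(10)) is five bytes long instead of four, because the value 10 contains a 0x0A byte, while rawBytes(7) passes through unchanged. That would be consistent with small maps working and larger ones failing: the more data, the more likely one of the problematic byte values shows up.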
In the process of delving into the details, I came up with a rigorous roundtrip tester application. The code is ~unlive~ on Coliru, so I put up a Gist with the files:
File test.cpp
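#include <iostream>
#include <array>
#include <cassert>
#include <string>
#include <tuple>
#include <unordered_map>
#include <vector>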
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/binary_iarchive.hpp>
#include <boost/serialization/serialization.hpp>
#include <boost/serialization/unordered_map.hpp>
#include <boost/serialization/map.hpp>
#include <boost/serialization/vector.hpp>
#include <boost/serialization/array.hpp>
#include <boost/functional.hpp>
#include <boost/container_hash/hash.hpp>
using boost::hash_value;
struct Card {
int rank, suit;
Card(int rank = 0, int suit = 0) : rank(rank), suit(suit) {}
template<class Archive>
void serialize(Archive& ar, unsigned /*version*/) {
ar & rank;
ar & suit;
}
friend size_t hash_value(Card const& c) {
auto v = hash_value(c.rank);
boost::hash_combine(v, hash_value(c.suit));
return v;
}
auto tied() const { return std::tie(rank, suit); }
bool operator<(Card const& rhs) const { return tied() < rhs.tied(); }
bool operator==(Card const& rhs) const { return tied() == rhs.tied(); }
};
struct DrawAction {
bool fromDeck;
Card card;
explicit DrawAction(bool fromDeck = false, Card card = {})
: fromDeck(fromDeck), card(card) {}
template<class Archive>
void serialize(Archive& ar, unsigned /*version*/) {
ar & fromDeck;
ar & card;
}
friend size_t hash_value(DrawAction const& da) {
auto v = hash_value(da.fromDeck);
boost::hash_combine(v, hash_value(da.card));
return v;
}
auto tied() const { return std::tie(fromDeck, card); }
bool operator<(DrawAction const& rhs) const { return tied() < rhs.tied(); }
bool operator==(DrawAction const& rhs) const { return tied() == rhs.tied(); }
};
using Cards = std::vector<Card>;
using Hand = std::array<Card, 10>;
using Draws = std::vector<DrawAction>;
class InfosetHistory {
public:
Card initialDiscardPile;
Hand initialHand;
Draws playerDrawActions;
Cards playerDiscardActions;
Draws opponentDrawActions;
Cards opponentDiscardActions;
InfosetHistory(
Card initialDiscardPile = {},
Hand hand = {},
Draws playerDrawActions = {},
Cards playerDiscardActions = {},
Draws opponentDrawActions = {},
Cards opponentDiscardActions = {}
) : initialDiscardPile(initialDiscardPile),
initialHand(std::move(hand)),
playerDrawActions(std::move(playerDrawActions)),
playerDiscardActions(std::move(playerDiscardActions)),
opponentDrawActions(std::move(opponentDrawActions)),
opponentDiscardActions(std::move(opponentDiscardActions)) {}
template<class Archive>
void serialize(Archive& ar, const unsigned int /*version*/) {
ar & initialDiscardPile & initialHand
& playerDrawActions & playerDiscardActions
& opponentDrawActions & opponentDiscardActions;
}
friend size_t hash_value(InfosetHistory const& ish) {
auto v = hash_value(ish.initialDiscardPile);
auto combine = [&v](auto& range) { boost::hash_range(v, begin(range), end(range)); };
combine(ish.initialHand);
combine(ish.playerDrawActions);
combine(ish.playerDiscardActions);
combine(ish.opponentDrawActions);
combine(ish.opponentDiscardActions);
return v;
}
auto tied() const { return std::tie(initialDiscardPile, initialHand,
playerDrawActions, playerDiscardActions, opponentDrawActions,
opponentDiscardActions); }
bool operator<(InfosetHistory const& rhs) const { return tied() < rhs.tied(); }
bool operator==(InfosetHistory const& rhs) const { return tied() == rhs.tied(); }
};
class Node {
public:
Cards allowedActions;
unsigned int NUM_ACTIONS{};
std::vector<double> regretSum;
std::vector<double> strategySum;
unsigned char phase{};
explicit Node(std::vector<Card> allowedActions = {},
unsigned int NUM_ACTIONS = 0,
std::vector<double> regretSum = {},
std::vector<double> strategySum = {},
unsigned char phase = 0
) : allowedActions(std::move(allowedActions)),
NUM_ACTIONS(NUM_ACTIONS),
regretSum(std::move(regretSum)),
strategySum(std::move(strategySum)),
phase(phase) {}
template<class Archive>
void serialize(Archive& ar, unsigned /*version*/) {
ar & allowedActions
& NUM_ACTIONS
& regretSum & strategySum & phase;
}
auto tied() const { return std::tie(allowedActions, NUM_ACTIONS, regretSum, strategySum, phase); }
bool operator<(Node const& rhs) const { return tied() < rhs.tied(); }
bool operator==(Node const& rhs) const { return tied() == rhs.tied(); }
};
#include <map>
#include <fstream>
#if defined(ORDERED_MAP)
template<typename K, typename V>
using htable = std::map<K, V>;
template <typename K, typename V>
static inline bool check_insert(htable<K, V>& t, K k, V v) {
return t.emplace(std::move(k), std::move(v)).second;
}
#elif defined(UNORDERED_MAP)
template<typename K, typename V>
using htable = std::unordered_map<K, V, boost::hash<K> >;
template <typename K, typename V>
static inline bool check_insert(htable<K, V>& t, K k, V v) {
return t.emplace(std::move(k), std::move(v)).second;
}
#elif defined(INPUT_PRESERVING) // retain exact input order
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/sequenced_index.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/member.hpp>
namespace bmi = boost::multi_index;
template<typename K, typename V, typename P = std::pair<K, V> >
using htable = bmi::multi_index_container<
P,
bmi::indexed_by<
bmi::sequenced<>,
bmi::hashed_unique<bmi::member<P, K, &P::first>, boost::hash<K> >
>
>;
template <typename K, typename V>
static inline bool check_insert(htable<K, V>& t, K k, V v) {
return t.insert(t.end(), std::make_pair(std::move(k), std::move(v))).second;
}
#endif
using NodeMap = htable<Hand, htable<InfosetHistory, Node>>;
NodeMap readTextStrategy(std::string const& fname);
NodeMap readBinaryStrategy(std::string const& fname);
void writeTextStrategy(std::string const& fname, NodeMap const& nodeMap);
void writeBinaryStrategy(std::string const& fname, NodeMap const& nodeMap);
int main() {
auto const original = readTextStrategy("StrategyWritten0.txt");
NodeMap bin = original, txt;
for (int i = 1; i<5; ++i) {
auto const fname = "roundtrip" + std::to_string(i);
writeBinaryStrategy(fname + ".bin", bin);
writeTextStrategy(fname + ".txt", bin);
bin = readBinaryStrategy(fname + ".bin");
txt = readTextStrategy(fname + ".txt");
std::cout << "text roundtrip " << i << " is " << (txt == original?"equal":"different") << "\n";
std::cout << "bin roundtrip " << i << " is " << (bin == original?"equal":"different") << "\n";
}
}
void writeBinaryStrategy(std::string const& fname, NodeMap const& nodeMap) {
std::ofstream f(fname, std::ios::binary);
boost::archive::binary_oarchive archive(f);
archive << nodeMap;
}
NodeMap readBinaryStrategy(std::string const& fname) {
NodeMap nodeMap;
std::ifstream ifs(fname, std::ios::binary);
boost::archive::binary_iarchive ia(ifs);
ia >> nodeMap;
return nodeMap;
}
#include <iomanip>
#include <boost/lexical_cast.hpp> // full precision see https://stackoverflow.com/a/48085309/85371
namespace TextSerialization {
#if defined(__MINGW32__) || defined(WIN32)
static constexpr char const* CRLF = "\n";
#else
static constexpr char const* CRLF = "\r\n";
#endif
static inline std::ostream& operator<<(std::ostream& os, Card const& c) {
return os << c.rank << CRLF << c.suit;
}
static inline std::ostream& operator<<(std::ostream& os, DrawAction const& da) {
return os << da.fromDeck << CRLF << da.card;
}
template <typename T, size_t N>
static inline std::ostream& operator<<(std::ostream& os, std::array<T, N> const& from) {
auto n = N;
for (auto& el : from)
os << el << (--n?CRLF:"");
return os;
}
template <typename... T>
static inline std::ostream& operator<<(std::ostream& os, std::vector<T...> const& from) {
os << from.size();
for (auto& el : from)
os << CRLF << el;
return os;
}
template <typename K, typename V>
static inline std::ostream& operator<<(std::ostream& os, htable<K, V> const& from) {
auto n = from.size();
os << n;
for (auto& [k,v] : from)
os << CRLF << k << CRLF << v;
return os;
}
static inline std::ostream& operator<<(std::ostream& os, InfosetHistory const& ish) {
return os
<< ish.initialHand << CRLF << ish.initialDiscardPile << CRLF
<< ish.playerDrawActions << CRLF << ish.playerDiscardActions << CRLF
<< ish.opponentDrawActions << CRLF << ish.opponentDiscardActions;
}
static inline std::ostream& operator<<(std::ostream& os, Node const& n) {
assert(n.NUM_ACTIONS == n.regretSum.size());
assert(n.NUM_ACTIONS == n.strategySum.size());
os << n.allowedActions << CRLF
<< n.NUM_ACTIONS << CRLF;
for (auto& v: {n.regretSum, n.strategySum})
for (auto& el: v)
os << boost::lexical_cast<std::string>(el) << CRLF;
return os << n.phase;
}
}
namespace TextDeserialization {
template <typename Cont>
static inline void read_n(std::istream& is, size_t n, Cont& into) {
while (n--)
is >> *into.emplace(into.end());
}
static inline std::istream& operator>>(std::istream& is, Card& c) {
return is >> c.rank >> c.suit;
}
static inline std::istream& operator>>(std::istream& is, DrawAction& da) {
return is >> da.fromDeck >> da.card;
}
template <typename T, size_t N>
static inline std::istream& operator>>(std::istream& is, std::array<T, N>& into) {
for (auto& el : into)
is >> el;
return is;
}
template <typename... T>
static inline std::istream& operator>>(std::istream& is, std::vector<T...>& into) {
size_t n;
is >> n;
read_n(is, n, into);
return is;
}
template <typename K, typename V>
static inline std::istream& operator>>(std::istream& is, htable<K, V>& into) {
size_t n;
is >> n;
K k; V v;
while (n--) {
if (is >> k >> v && !check_insert(into, std::move(k), std::move(v)))
throw std::range_error("duplicate key");
}
return is;
}
static inline std::istream& operator>>(std::istream& is, InfosetHistory& ish) {
return is
>> ish.initialHand >> ish.initialDiscardPile
>> ish.playerDrawActions >> ish.playerDiscardActions
>> ish.opponentDrawActions >> ish.opponentDiscardActions;
}
static inline std::istream& operator>>(std::istream& is, Node& n) {
is >> n.allowedActions;
is >> n.NUM_ACTIONS;
read_n(is, n.NUM_ACTIONS, n.regretSum);
read_n(is, n.NUM_ACTIONS, n.strategySum);
return is >> n.phase;
}
}
void writeTextStrategy(std::string const& fname, NodeMap const& nodeMap) {
using namespace TextSerialization;
std::ofstream os(fname);
os << nodeMap << CRLF;
}
NodeMap readTextStrategy(std::string const& fname) {
using namespace TextDeserialization;
std::ifstream is(fname);
NodeMap nodeMap;
is >> nodeMap;
return nodeMap;
}
File CMakeLists.txt
CMAKE_MINIMUM_REQUIRED(VERSION 3.16)
PROJECT(work)
SET(CMAKE_CXX_STANDARD 17)
SET(CMAKE_CXX_FLAGS "-g -O0 -L .")
LINK_DIRECTORIES("C:\\boost\\lib")
LINK_LIBRARIES(boost_serialization-mgw8-mt-d-x64-1_73)
INCLUDE_DIRECTORIES("C:\\boost\\include\\boost-1_73")
ADD_EXECUTABLE(ordered test.cpp)
TARGET_COMPILE_DEFINITIONS(ordered PRIVATE "-DORDERED_MAP")
ADD_EXECUTABLE(unordered test.cpp)
TARGET_COMPILE_DEFINITIONS(unordered PRIVATE "-DUNORDERED_MAP")
ADD_EXECUTABLE(multi-index test.cpp)
TARGET_COMPILE_DEFINITIONS(multi-index PRIVATE "-DINPUT_PRESERVING")
Build:
mkdir -p build && (cd build; cmake ..; cmake --build .)
When tested with the input-order preserving data structure (multi-index, with -DINPUT_PRESERVING):
./build/multi-index.exe
md5sum roundtrip*.txt StrategyWritten0.txt
Prints
text roundtrip 1 is equal
bin roundtrip 1 is equal
text roundtrip 2 is equal
bin roundtrip 2 is equal
text roundtrip 3 is equal
bin roundtrip 3 is equal
text roundtrip 4 is equal
bin roundtrip 4 is equal
d62dd8fe217595f2e069eabf54de479a *roundtrip1.txt
d62dd8fe217595f2e069eabf54de479a *roundtrip2.txt
d62dd8fe217595f2e069eabf54de479a *roundtrip3.txt
d62dd8fe217595f2e069eabf54de479a *roundtrip4.txt
d62dd8fe217595f2e069eabf54de479a *StrategyWritten0.txt
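Matching checksums only tell you that the files are identical. When they are not, it helps to know where they diverge. A minimal sketch of a byte-wise comparison, assuming both files fit the simple "read until first mismatch" approach. firstDifference is a hypothetical helper, not part of the tester above. Note that it opens both files with std::ios::binary so the comparison itself is not affected by line-ending translation.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the offset of the first differing byte,
// or -1 if both files have identical contents.
long long firstDifference(std::string const& pathA, std::string const& pathB) {
    std::ifstream a(pathA, std::ios::binary);
    std::ifstream b(pathB, std::ios::binary);
    if (!a || !b)
        throw std::runtime_error("cannot open input file");

    std::istreambuf_iterator<char> itA(a), itB(b), end;
    long long offset = 0;
    for (; itA != end && itB != end; ++itA, ++itB, ++offset) {
        if (*itA != *itB)
            return offset;
    }
    // If only one file ended, they differ at the length of the shorter one.
    return (itA == end && itB == end) ? -1 : offset;
}
```

For example, firstDifference("roundtrip1.txt", "StrategyWritten0.txt") would return -1 for the run shown above, since the checksums match.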
Quoting the question: "I serialize nodeMap as follows:"
std::ofstream file("Strategy" + std::to_string(iteration) + ".bin");
boost::archive::binary_oarchive archive(file);
archive << nodeMap;
file.close();
That risks an incomplete archive, which could be your problem: you close the file before the archive's destructor has run. Better:
{
std::ofstream file("Strategy" + std::to_string(iteration) + ".bin");
boost::archive::binary_oarchive archive(file);
archive << nodeMap;
}
That way, the destructor of the archive runs before that of the file (as a bonus you don't have to manually close it).
(Destructors run in opposite order of construction).
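The ordering rule is easy to check in isolation. Here is a small sketch using a hypothetical Tracer type that stands in for the stream and the archive. It records its name when destroyed, so the log shows which object went away first. None of these names come from Boost.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a stream or an archive: logs its name on destruction.
struct Tracer {
    std::string name;
    std::vector<std::string>& log;

    Tracer(std::string name, std::vector<std::string>& log)
        : name(std::move(name)), log(log) {}
    ~Tracer() { log.push_back(name); }
};

// Returns the order in which the two locals were destroyed.
std::vector<std::string> destructionOrder() {
    std::vector<std::string> log;
    {
        Tracer file("file", log);       // constructed first
        Tracer archive("archive", log); // constructed second
    } // leaving the scope destroys archive first, then file
    return log;
}
```

Calling destructionOrder() yields "archive" followed by "file", which is the order you want: the archive gets to finish writing while the stream underneath it is still open.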
Other things to keep in mind
Upvotes: 5