Bhaskar

Reputation: 700

How to debug boost archive exception input stream error

I'm trying to serialize an object of the following type

std::unordered_map<std::vector<Card>, std::unordered_map<InfosetHistory, Node>>& nodeMap

Card is a struct, InfosetHistory and Node are classes which use some other structs as member variables. I have created serialize functions for all classes that need it. For example here's the one for Card:

struct Card {
    int rank;
    int suit;
    ...

    template<class Archive>
    void serialize(Archive& ar, const unsigned int version) {
        ar & rank;
        ar & suit;
    }
};

I serialize nodeMap as follows:

std::ofstream file("Strategy" + std::to_string(iteration) + ".bin");
boost::archive::binary_oarchive archive(file);
archive << nodeMap;
file.close();

and deserialize separately like this (Currently choosing to deserialize "Strategy0.bin"):

std::ifstream ifs("Strategy0.bin");
std::unordered_map<std::vector<Card>, std::unordered_map<InfosetHistory, Node>> nodeMap;
if (ifs.good()) {
    boost::archive::binary_iarchive ia(ifs);
    ia >> nodeMap;
}

When I run the program to create and serialize nodeMap, I am always able to serialize with no issues. The respective .bin files are created, and their sizes seem appropriate for the data I expect them to store.

When I run the program to deserialize nodeMap, however, if the nodeMap isn't that large, I don't have issues, but if it is large, I will get the following error:

terminate called after throwing an instance of 'boost::archive::archive_exception'
  what():  input stream error

I assume this is not actually caused by nodeMap being large. Rather, there is some probability that the code creates an entry that causes problems, and the more entries are added, the greater that probability becomes. I read in the Boost documentation that this kind of error can be caused by uninitialized data. I don't believe I have uninitialized data, but I'm not sure how to verify that.

In general, I'm unsure how to go about debugging this sort of problem. Any help would be appreciated.

Note: I tried very hard to create a minimal reproducible example, but all the examples I created didn't produce the issue. It's only when I create this sort of object in my program, and add thousands of entries that I run into this kind of problem.

EDIT @sehe asked for some more code. Here are all the relevant sections of the classes and structs relevant to the serialized object:

https://pastebin.com/xPE8h8a3

Note that these classes and structs are in separate files. InfosetHistory and DrawAction are declared in InfosetHistory.h, Node is declared in Node.h, and Card is declared in GRUtil.h.

@sehe also mentioned that I don't mention the type serialized above. The type being serialized is a reference to the object I'm trying to serialize: std::unordered_map<std::vector<Card>, std::unordered_map<InfosetHistory, Node>>& nodeMap.

EDIT 2 I have managed to create a minimal reproducible example using the code @sehe provided below. Using my code, I created a NodeMap I knew would produce the deserialization error, and I printed all the data to a text file called "StrategyWritten0.txt". In this reproducible example, I input all the data from that text file to create a NodeMap, serialize all the data in the resulting NodeMap, and then attempt to deserialize the NodeMap. I get the following output:

Successfully created Node Map
Serialized Node Map
terminate called after throwing an instance of 'boost::archive::archive_exception'
  what():  input stream error

Here is the file:

https://drive.google.com/file/d/1y4FLgi7f-XWJ-igRq_tItK-pXFDEUbhg/view?usp=sharing

And here is the code:

https://pastebin.com/FebQqssx

Upvotes: 2

Views: 1411

Answers (1)

sehe

Reputation: 393009

UPDATE

After struggling with this for days (installing MingW on a VM and debugging into the nitty-gritty details) I narrowed it down to some special condition happening after the 25th element of an associative container. It was shortly after that when I had the brainwave:

Yes. Windows line endings ruined a good few days for a couple of people. Again.

I always write std::ios::binary flags on my archive files, but in this particular case I hadn't spotted they were missing.

Adding them will fix things:

void writeBinaryStrategy(std::string const& fname, NodeMap const& nodeMap) {
    std::ofstream f(fname, std::ios::binary);
    boost::archive::binary_oarchive archive(f);
    archive << nodeMap;
}

NodeMap readBinaryStrategy(std::string const& fname) {
    NodeMap nodeMap;
    std::ifstream ifs(fname, std::ios::binary);

    boost::archive::binary_iarchive ia(ifs);
    ia >> nodeMap;

    return nodeMap;
}

In the process of delving into the details, I came up with a rigorous roundtrip tester application. The code can't run live on Coliru, so I put up a Gist with the files:


  • File test.cpp

    #include <iostream>
    #include <boost/archive/binary_oarchive.hpp>
    #include <boost/archive/binary_iarchive.hpp>
    #include <boost/serialization/serialization.hpp>
    #include <boost/serialization/unordered_map.hpp>
    #include <boost/serialization/map.hpp>
    #include <boost/serialization/vector.hpp>
    #include <boost/serialization/array.hpp>
    #include <boost/functional.hpp>
    #include <boost/container_hash/hash.hpp>
    
    using boost::hash_value;
    
    struct Card {
        int rank, suit;
    
        Card(int rank = 0, int suit = 0) : rank(rank), suit(suit) {}
    
        template<class Archive>
        void serialize(Archive& ar, unsigned /*version*/) {
            ar & rank;
            ar & suit;
        }
    
        friend size_t hash_value(Card const& c) {
            auto v = hash_value(c.rank);
            boost::hash_combine(v, hash_value(c.suit));
            return v;
        }
    
        auto tied() const { return std::tie(rank, suit); }
        bool operator<(Card const& rhs) const { return tied() < rhs.tied(); }
        bool operator==(Card const& rhs) const { return tied() == rhs.tied(); }
    };
    
    struct DrawAction {
        bool fromDeck;
        Card card;
    
        explicit DrawAction(bool fromDeck = false, Card card = {})
                : fromDeck(fromDeck), card(card) {}
    
        template<class Archive>
        void serialize(Archive& ar, unsigned /*version*/) {
            ar & fromDeck;
            ar & card;
        }
    
        friend size_t hash_value(DrawAction const& da) {
            auto v = hash_value(da.fromDeck);
            boost::hash_combine(v, hash_value(da.card));
            return v;
        }
    
        auto tied() const { return std::tie(fromDeck, card); }
        bool operator<(DrawAction const& rhs) const { return tied() < rhs.tied(); }
        bool operator==(DrawAction const& rhs) const { return tied() == rhs.tied(); }
    };
    
    using Cards = std::vector<Card>;
    using Hand  = std::array<Card, 10>;
    using Draws = std::vector<DrawAction>;
    
    class InfosetHistory {
    public:
        Card initialDiscardPile;
        Hand initialHand;
        Draws playerDrawActions;
        Cards playerDiscardActions;
        Draws opponentDrawActions;
        Cards opponentDiscardActions;
    
        InfosetHistory(
                Card initialDiscardPile = {},
                Hand hand = {},
                Draws playerDrawActions = {},
                Cards playerDiscardActions = {},
                Draws opponentDrawActions = {},
                Cards opponentDiscardActions = {}
        ) : initialDiscardPile(initialDiscardPile),
            initialHand(std::move(hand)),
            playerDrawActions(std::move(playerDrawActions)),
            playerDiscardActions(std::move(playerDiscardActions)),
            opponentDrawActions(std::move(opponentDrawActions)),
            opponentDiscardActions(std::move(opponentDiscardActions)) {}
    
        template<class Archive>
        void serialize(Archive& ar, const unsigned int /*version*/) {
            ar & initialDiscardPile & initialHand
            & playerDrawActions & playerDiscardActions
            & opponentDrawActions & opponentDiscardActions;
        }
    
        friend size_t hash_value(InfosetHistory const& ish) {
            auto v = hash_value(ish.initialDiscardPile);
    
            auto combine = [&v](auto& range) { boost::hash_range(v, begin(range), end(range)); };
            combine(ish.initialHand);
            combine(ish.playerDrawActions);
            combine(ish.playerDiscardActions);
            combine(ish.opponentDrawActions);
            combine(ish.opponentDiscardActions);
            return v;
        }
    
        auto tied() const { return std::tie(initialDiscardPile, initialHand,
                playerDrawActions, playerDiscardActions, opponentDrawActions,
                opponentDiscardActions); }
    
        bool operator<(InfosetHistory const& rhs) const { return tied() < rhs.tied(); }
        bool operator==(InfosetHistory const& rhs) const { return tied() == rhs.tied(); }
    };
    
    class Node {
    public:
        Cards allowedActions;
        unsigned int NUM_ACTIONS{};
        std::vector<double> regretSum;
        std::vector<double> strategySum;
        unsigned char phase{};
    
        explicit Node(std::vector<Card> allowedActions = {},
                      unsigned int NUM_ACTIONS = 0,
                      std::vector<double> regretSum = {},
                      std::vector<double> strategySum = {},
                      unsigned char phase = 0
        ) : allowedActions(std::move(allowedActions)),
            NUM_ACTIONS(NUM_ACTIONS),
            regretSum(std::move(regretSum)),
            strategySum(std::move(strategySum)),
            phase(phase) {}
    
        template<class Archive>
        void serialize(Archive& ar, unsigned /*version*/) {
            ar & allowedActions
            & NUM_ACTIONS
            & regretSum & strategySum & phase;
        }
    
        auto tied() const { return std::tie(allowedActions, NUM_ACTIONS, regretSum, strategySum, phase); }
        bool operator<(Node const& rhs) const { return tied() < rhs.tied(); }
        bool operator==(Node const& rhs) const { return tied() == rhs.tied(); }
    };
    
    #include <map>
    #include <fstream>
    
    #if defined(ORDERED_MAP)
        template<typename K, typename V>
        using htable = std::map<K, V>;
    
        template <typename K, typename V>
        static inline bool check_insert(htable<K, V>& t, K k, V v) {
            return t.emplace(std::move(k), std::move(v)).second;
        }
    #elif defined(UNORDERED_MAP)
        template<typename K, typename V>
        using htable = std::unordered_map<K, V, boost::hash<K> >;
    
        template <typename K, typename V>
        static inline bool check_insert(htable<K, V>& t, K k, V v) {
            return t.emplace(std::move(k), std::move(v)).second;
        }
    #elif defined(INPUT_PRESERVING) // retain exact input order
        #include <boost/multi_index_container.hpp>
        #include <boost/multi_index/sequenced_index.hpp>
        #include <boost/multi_index/hashed_index.hpp>
        #include <boost/multi_index/member.hpp>
    
        namespace bmi = boost::multi_index;
    
        template<typename K, typename V, typename P = std::pair<K, V> >
        using htable = bmi::multi_index_container<
            P,
            bmi::indexed_by<
                bmi::sequenced<>,
                bmi::hashed_unique<bmi::member<P, K, &P::first>, boost::hash<K> >
            >
        >;
        template <typename K, typename V>
        static inline bool check_insert(htable<K, V>& t, K k, V v) {
            return t.insert(t.end(), std::make_pair(std::move(k), std::move(v))).second;
        }
    #endif
    
    using NodeMap = htable<Hand, htable<InfosetHistory, Node>>;
    
    NodeMap readTextStrategy(std::string const& fname);
    NodeMap readBinaryStrategy(std::string const& fname);
    
    void writeTextStrategy(std::string const& fname, NodeMap const& nodeMap);
    void writeBinaryStrategy(std::string const& fname, NodeMap const& nodeMap);
    
    int main() {
        auto const original = readTextStrategy("StrategyWritten0.txt");
    
        NodeMap bin = original, txt;
    
        for (int i = 1; i<5; ++i) {
            auto const fname = "roundtrip" + std::to_string(i);
    
            writeBinaryStrategy(fname + ".bin", bin);
            writeTextStrategy(fname + ".txt", bin);
            bin = readBinaryStrategy(fname + ".bin");
            txt = readTextStrategy(fname + ".txt");
    
            std::cout << "text roundtrip " << i << " is " << (txt == original?"equal":"different") << "\n";
            std::cout << "bin  roundtrip " << i << " is " << (bin == original?"equal":"different") << "\n";
        }
    }
    
    void writeBinaryStrategy(std::string const& fname, NodeMap const& nodeMap) {
        std::ofstream f(fname, std::ios::binary);
        boost::archive::binary_oarchive archive(f);
        archive << nodeMap;
    }
    
    NodeMap readBinaryStrategy(std::string const& fname) {
        NodeMap nodeMap;
        std::ifstream ifs(fname, std::ios::binary);
    
        boost::archive::binary_iarchive ia(ifs);
        ia >> nodeMap;
    
        return nodeMap;
    }
    
    #include <cassert> // for the assert() calls in operator<<(Node)
    #include <iomanip>
    #include <boost/lexical_cast.hpp> // full precision see https://stackoverflow.com/a/48085309/85371
    namespace TextSerialization {
    #if defined(__MINGW32__) || defined(WIN32)
        static constexpr char const* CRLF = "\n";
    #else
        static constexpr char const* CRLF = "\r\n";
    #endif
    
    
        static inline std::ostream& operator<<(std::ostream& os, Card const& c) {
            return os << c.rank << CRLF << c.suit;
        }
    
        static inline std::ostream& operator<<(std::ostream& os, DrawAction const& da) {
            return os << da.fromDeck << CRLF << da.card;
        }
    
        template <typename T, size_t N>
        static inline std::ostream& operator<<(std::ostream& os, std::array<T, N> const& from) {
            auto n = N;
            for (auto& el : from)
                os << el << (--n?CRLF:"");
            return os;
        }
    
        template <typename... T>
        static inline std::ostream& operator<<(std::ostream& os, std::vector<T...> const& from) {
            os << from.size();
            for (auto& el : from)
                os << CRLF << el;
            return os;
        }
    
        template <typename K, typename V>
        static inline std::ostream& operator<<(std::ostream& os, htable<K, V> const& from) {
            auto n = from.size();
            os << n;
            for (auto& [k,v] : from)
                os << CRLF << k << CRLF << v;
            return os;
        }
    
        static inline std::ostream& operator<<(std::ostream& os, InfosetHistory const& ish) {
            return os
                << ish.initialHand         << CRLF << ish.initialDiscardPile << CRLF
                << ish.playerDrawActions   << CRLF << ish.playerDiscardActions << CRLF
                << ish.opponentDrawActions << CRLF << ish.opponentDiscardActions;
        }
    
        static inline std::ostream& operator<<(std::ostream& os, Node const& n) {
            assert(n.NUM_ACTIONS == n.regretSum.size());
            assert(n.NUM_ACTIONS == n.strategySum.size());
    
            os << n.allowedActions << CRLF
               << n.NUM_ACTIONS << CRLF;
            for (auto& v: {n.regretSum, n.strategySum})
                for (auto& el: v)
                    os << boost::lexical_cast<std::string>(el) << CRLF;
            return os << n.phase;
        }
    }
    
    namespace TextDeserialization {
        template <typename Cont>
        static inline void read_n(std::istream& is, size_t n, Cont& into) {
            while (n--)
                is >> *into.emplace(into.end());
        }
    
        static inline std::istream& operator>>(std::istream& is, Card& c) {
            return is >> c.rank >> c.suit;
        }
    
        static inline std::istream& operator>>(std::istream& is, DrawAction& da) {
            return is >> da.fromDeck >> da.card;
        }
    
        template <typename T, size_t N>
        static inline std::istream& operator>>(std::istream& is, std::array<T, N>& into) {
            for (auto& el : into)
                is >> el;
            return is;
        }
    
        template <typename... T>
        static inline std::istream& operator>>(std::istream& is, std::vector<T...>& into) {
            size_t n;
            is >> n;
            read_n(is, n, into);
            return is;
        }
    
        template <typename K, typename V>
        static inline std::istream& operator>>(std::istream& is, htable<K, V>& into) {
            size_t n;
            is >> n;
            K k; V v;
            while (n--) {
                if (is >> k >> v && !check_insert(into, std::move(k), std::move(v)))
                    throw std::range_error("duplicate key");
            }
            return is;
        }
    
        static inline std::istream& operator>>(std::istream& is, InfosetHistory& ish) {
            return is
                >> ish.initialHand >> ish.initialDiscardPile
                >> ish.playerDrawActions >> ish.playerDiscardActions
                >> ish.opponentDrawActions >> ish.opponentDiscardActions;
        }
    
        static inline std::istream& operator>>(std::istream& is, Node& n) {
            is >> n.allowedActions;
            is >> n.NUM_ACTIONS;
            read_n(is, n.NUM_ACTIONS, n.regretSum);
            read_n(is, n.NUM_ACTIONS, n.strategySum);
            return is >> n.phase;
        }
    }
    
    void writeTextStrategy(std::string const& fname, NodeMap const& nodeMap) {
        using namespace TextSerialization;
        std::ofstream os(fname);
        os << nodeMap << CRLF;
    }
    
    NodeMap readTextStrategy(std::string const& fname) {
        using namespace TextDeserialization;
        std::ifstream is(fname);
    
        NodeMap nodeMap;
        is >> nodeMap;
    
        return nodeMap;
    }
    
  • File CMakeLists.txt

    PROJECT(work)
    CMAKE_MINIMUM_REQUIRED(VERSION 3.16)
    
    SET(CMAKE_CXX_STANDARD 17)
    SET(CMAKE_CXX_FLAGS "-g -O0 -L .")
    
    LINK_DIRECTORIES("C:\\boost\\lib")
    LINK_LIBRARIES(boost_serialization-mgw8-mt-d-x64-1_73)
    INCLUDE_DIRECTORIES("C:\\boost\\include\\boost-1_73")
    
    ADD_EXECUTABLE(ordered test.cpp)
    TARGET_COMPILE_DEFINITIONS(ordered PRIVATE "-DORDERED_MAP")
    
    ADD_EXECUTABLE(unordered test.cpp)
    TARGET_COMPILE_DEFINITIONS(unordered PRIVATE "-DUNORDERED_MAP")
    
    ADD_EXECUTABLE(multi-index test.cpp)
    TARGET_COMPILE_DEFINITIONS(multi-index PRIVATE "-DINPUT_PRESERVING")
    

Build:

mkdir -p build && (cd build; cmake ..; cmake --build .)

When tested with the input-order preserving data structure (multi-index, with -DINPUT_PRESERVING):

./build/multi-index.exe
md5sum roundtrip*.txt StrategyWritten0.txt

Prints

text roundtrip 1 is equal
bin  roundtrip 1 is equal
text roundtrip 2 is equal
bin  roundtrip 2 is equal
text roundtrip 3 is equal
bin  roundtrip 3 is equal
text roundtrip 4 is equal
bin  roundtrip 4 is equal
d62dd8fe217595f2e069eabf54de479a *roundtrip1.txt
d62dd8fe217595f2e069eabf54de479a *roundtrip2.txt
d62dd8fe217595f2e069eabf54de479a *roundtrip3.txt
d62dd8fe217595f2e069eabf54de479a *roundtrip4.txt
d62dd8fe217595f2e069eabf54de479a *StrategyWritten0.txt

Original Answer

I serialize nodeMap as follows:

std::ofstream file("Strategy" + std::to_string(iteration) + ".bin");
boost::archive::binary_oarchive archive(file);
archive << nodeMap;
file.close();

That risks an incomplete archive, which could be your problem: you close the file before the archive's destructor has run. Better:

{ 
    std::ofstream file("Strategy" + std::to_string(iteration) + ".bin");
    boost::archive::binary_oarchive archive(file);
    archive << nodeMap;
}

That way, the destructor of the archive runs before that of the file (as a bonus you don't have to manually close it).

(Destructors run in opposite order of construction).

Other things to keep in mind

  • you don't show the type serialized. Be very sure you deserialize the exact same type as you serialize (see e.g. Error using boost serialization with binary archive)
  • be sure the platforms are compatible. Boost's binary archives are not portable: if the target platform differs (endianness, integer sizes, etc.), reading the archive is Undefined Behavior

Upvotes: 5
