Lars Kakavandi-Nielsen

Reputation: 2198

Get sub-map from std::map by number of elements instead of key using iterator

I have a std::map<std::string, std::vector<std::string>> and I need to perform a threaded task on this map by dividing the map into sub-maps and passing each sub-map to a thread.

With a std::vector<T> I would be able to get a sub-vector pretty easily, by doing this:

#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

int main(void)
{
    size_t off = 0; 
    size_t num_elms = 100; // Made up value 
    std::vector<uint8_t> full; // Assume filled with stuff
    // Vector iterators support "+", so the [first, last) copy is a one-liner
    std::vector<uint8_t> sub(std::begin(full) + off, std::begin(full) + off + num_elms);
    off = off + num_elms;
}

However, doing the same with std::map<T1, T2> gives a compilation error.

#include <vector>
#include <map>
#include <string>

int main(void)
{
    size_t off = 0; 
    size_t num_elms = 100; 
    
    std::map<std::string, std::vector<std::string>> full; 
    std::map<std::string, std::vector<std::string>> sub(std::begin(full) + off, 
                                                        std::begin(full) + off + num_elms); 
    off = off + num_elms;
}

It is the same with other std::map "types", which, from what I have gathered, is down to the iterator.
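As a rough illustration of the iterator point, here is a small compile-time check. It is only a sketch: the helper name is_random_access_v is made up for this example and is not part of the standard library. It shows that std::vector iterators are random access iterators (the category that provides "it + n"), while std::map iterators are not.

```cpp
#include <cstdint>
#include <iterator>
#include <map>
#include <string>
#include <type_traits>
#include <vector>

using Map = std::map<std::string, std::vector<std::string>>;

// Hypothetical helper (not a standard trait): true when It is a
// random access iterator, the category that provides "it + n".
template <typename It>
constexpr bool is_random_access_v = std::is_base_of_v<
    std::random_access_iterator_tag,
    typename std::iterator_traits<It>::iterator_category>;

// std::vector: random access, so std::begin(v) + off compiles.
static_assert(is_random_access_v<std::vector<std::uint8_t>::iterator>,
              "vector iterators are random access");

// std::map: bidirectional only (++it and --it), so std::begin(m) + off does not compile.
static_assert(!is_random_access_v<Map::iterator>,
              "map iterators are not random access");
```

Both static_asserts pass, which matches the compilation error above: the map iterator simply has no operator+.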

What is possible is to extract the keys and do something similar to this solution:

#include <map>
#include <vector>
#include <string>

#include <iostream>

void print_map(const std::map<std::string, std::vector<std::string>>& _map)
{
    for (const auto& [key, value] : _map)
    {
        std::cout << "key: " << key << "\nvalues\n";
        for (const auto& elm : value)
        {
            std::cout << "\t" << elm << "\n"; 
        }
    }
}

void print_keys(const std::vector<std::string>& keys)
{
    std::cout << "keys: \n"; 
    for(const auto& key : keys)
    {
        std::cout << key << "\n"; 
    }
}

int main(void)
{
    std::map<std::string, std::vector<std::string>> full;

    full["aa"] = {"aa", "aaaa", "aabb"};
    full["bb"] = {"bb", "bbbbb", "bbaa"};
    full["cc"] = {"cc", "cccc", "ccbb"};
    full["dd"] = {"dd", "dd", "ddcc"};

    print_map(full);

    std::vector<std::string> keys;

    for (const auto& [key, value] : full)
    {
        (void) value;
        keys.emplace_back(key); 
    }

    print_keys(keys); 

    size_t off = 0;
    size_t num_elms = 2;
    
    
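    // The range constructor copies [first, last), so the element at "last" is excluded.
    std::map<std::string, std::vector<std::string>> sub1 (full.find(keys.at(off)), full.find(keys.at(off + num_elms)));
    off = off + num_elms; 
    // The final chunk has no key one past its end, so stop at full.end() to keep "dd".
    std::map<std::string, std::vector<std::string>> sub2 (full.find(keys.at(off)), full.end());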

    std::cout << "sub1:\n";
    print_map(sub1);
    std::cout << "sub2:\n";
    print_map(sub2);     
}

However, this has the potential to be extremely inefficient, as the map can be really big (10k+ elements).

So, is there a better way to replicate the std::vector approach with std::map?

Upvotes: 2

Views: 363

Answers (2)

Ted Lyngmo

Reputation: 118097

A slightly different approach would be to use one of the execution policies added in C++17, like std::execution::parallel_policy. In the example below, the instance std::execution::par is used:

#include <algorithm> // std::for_each
#include <execution> // std::execution::par

    // ...

    std::for_each(std::execution::par, full.begin(), full.end(), [](auto& p) {
        // Here you are likely using a thread from a built-in thread pool
        auto& vec = p.second;
        // do work with "vec"
    });

Upvotes: 3

Caleth

Reputation: 63392

With a slight adaptation, you can reasonably easily pass ranges to print_map, and divide up your map by calling std::next on an iterator.

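#include <cstddef>
#include <iostream>
#include <iterator>
#include <map>
#include <string>
#include <vector>

// Minimal range-for support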
template <typename Iter>
struct Range {
    Range (Iter b, Iter e) : b(b), e(e) {}
    Iter b;
    Iter e;

    Iter begin() const { return b; }
    Iter end() const { return e; }
};

// some shorter aliases
using Map = std::map<std::string, std::vector<std::string>>;
using MapView = Range<Map::const_iterator>;

// not necessarily the whole map
void print_map(MapView map) {
    for (const auto& [key, value] : map)
    {
        std::cout << "key: " << key << "\nvalues\n";
        for (const auto& elm : value)
        {
            std::cout << "\t" << elm << "\n"; 
        }
    }
}

int main(void)
{
    Map full;

    full["aa"] = {"aa", "aaaa", "aabb"};
    full["bb"] = {"bb", "bbbbb", "bbaa"};
    full["cc"] = {"cc", "cccc", "ccbb"};
    full["dd"] = {"dd", "dd", "ddcc"};

    // can still print the whole map
    print_map({ full.begin(), full.end() });

    size_t num_elms = 2;
    size_t num_full_views = full.size() / num_elms;
    
    std::vector<MapView> views;

    auto it = full.begin();
    for (size_t i = 0; i < num_full_views; ++i) {
        auto next = std::next(it, num_elms);
        views.emplace_back(it, next);
        it = next;
    }

    if (it != full.end()) {
        views.emplace_back(it, full.end());
    }

    for (auto view : views) {
        print_map(view);
    }
}
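The splitting loop above could also be factored into a small helper, so the same ranges can be handed to worker threads instead of print_map. The following is only a sketch under assumptions: split_into_ranges and MapRange are hypothetical names introduced here, and plain iterator pairs stand in for the Range struct to keep the block self-contained.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Map = std::map<std::string, std::vector<std::string>>;
// Hypothetical alias: a half-open [first, last) range of map elements.
using MapRange = std::pair<Map::const_iterator, Map::const_iterator>;

// Hypothetical helper: cut `m` into consecutive ranges of at most `chunk`
// elements. Nothing is copied; the ranges refer to elements owned by `m`.
std::vector<MapRange> split_into_ranges(const Map& m, std::size_t chunk)
{
    std::vector<MapRange> ranges;
    if (chunk == 0) {
        return ranges; // a zero step would never make progress
    }

    std::size_t remaining = m.size();
    auto it = m.begin();
    while (remaining > 0) {
        // The last range may hold fewer than `chunk` elements.
        const std::size_t step = std::min(chunk, remaining);
        auto next = std::next(it, static_cast<std::ptrdiff_t>(step));
        ranges.emplace_back(it, next);
        it = next;
        remaining -= step;
    }
    return ranges;
}
```

Each returned pair could then be passed to a worker, for example std::thread(worker, r.first, r.second) with a worker function of your own. The ranges stay usable as long as the map outlives the workers and is not modified while they run. Note that std::next still walks the map one node at a time, so building all ranges is a single linear pass over the map.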

In C++20 (or with another ranges library), this can be simplified with std::ranges::drop_view / std::ranges::take_view.

// Use Map& here: the loop below pipes the lvalue "full", and an rvalue Map would give a different view type
using MapView = decltype(std::declval<Map&>() | std::ranges::views::drop(0) | std::ranges::views::take(0));

for (size_t i = 0; i < full.size(); i += num_elms) {
    views.push_back(full | std::ranges::views::drop(i) | std::ranges::views::take(num_elms));
}

Upvotes: 1
