Merge std::unordered_map iteratively

Question

I have a list of nodes that each decompose into more nodes. For example:

Node0 = w01 * Node1 + w02 * Node2 + w03 * Node3
Node1 = w12 * Node2 + w14 * Node4

Therefore, we have Node0 = (w02 + w01*w12) * Node2 + w03 * Node3 + w01*w14 * Node4.

My C++ code for performing the above aggregation/decomposition/merging for a given set of weight decompositions looks as follows. However, I feel there are a lot of optimisations to be made. To name just one, I am looping over the keys of topWeights and collecting them in topNodeNames, which seems terribly inefficient.

Are there any STL algorithms that could help me speed this up, and possibly avoid unnecessary copying?

#include <string>
#include <unordered_map>
#include <vector>

template <typename T> using umap = std::unordered_map<std::string, T>;


umap<double> getWeights(const std::string& nodeName, const umap<umap<double>>& weightTrees)
{
    const auto it = weightTrees.find(nodeName);
    if (it == weightTrees.end())
        return umap<double>();

    umap<double> topWeights = it->second;
    std::vector<std::string> topNodeNames;

    for (const auto& kv : topWeights)
        topNodeNames.push_back(kv.first);

    for (const std::string& topNodeName : topNodeNames)
    {
        umap<double> subWeights = getWeights(topNodeName, weightTrees);
        if (subWeights.size() > 0)
        {
            const double topWeight = topWeights[topNodeName];
            topWeights.erase(topNodeName);
            for (const auto& subWeight : subWeights)
            {
                const auto it = topWeights.find(subWeight.first);
                if (it == topWeights.end())
                    topWeights[subWeight.first] = topWeight * subWeight.second;
                else
                    it->second += topWeight * subWeight.second;
            }
        }
    }

    return topWeights;
}


int main()
{
    umap<umap<double>> weightTrees = {{ "Node0", {{ "Node1",0.5 },{ "Node2",0.3 },{ "Node3",0.2 }} },
                                      { "Node1", {{ "Node2",0.1 },{ "Node4",0.9 }} }};

    umap<double> w = getWeights("Node0", weightTrees); // gives {Node2: 0.35, Node3: 0.20, Node4: 0.45}
}

Max Langhof · Accepted Answer

The main problem is that you recurse from every node into every subnode, which is generally highly redundant. One way to avoid this would be to introduce an order on the node names, where "higher" nodes depend only on "lower" nodes, and then calculate them in reverse order (for each node you'll already know all child weights exactly). However, I don't think there are std algorithms that will find this order for you, because you can't determine transitive node dependencies cheaply ("does node X depend on node Y? If not directly, we might have to search the entire tree...").
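To make the ordering idea concrete, here is a minimal sketch, assuming the graph is acyclic. It is not from the question or tied to any std algorithm: the names dependencyOrder and flattenAll and the alias WeightTrees are hypothetical. A depth-first post-order visit produces an order in which every non-leaf node comes after the non-leaf nodes it depends on, and the weights are then computed in that order.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

using NodeWeights = std::unordered_map<std::string, double>;
using WeightTrees = std::unordered_map<std::string, NodeWeights>; // hypothetical alias

// Returns the non-leaf nodes ordered so that each node appears after every
// non-leaf node it depends on (depth-first post-order). Assumes no cycles.
std::vector<std::string> dependencyOrder(const WeightTrees& trees)
{
    std::vector<std::string> order;
    std::unordered_set<std::string> visited;

    std::function<void(const std::string&)> visit = [&](const std::string& name)
    {
        const auto it = trees.find(name);
        if (it == trees.end() || !visited.insert(name).second)
            return; // a leaf, or a node that was already visited
        for (const auto& child : it->second)
            visit(child.first);
        order.push_back(name); // all non-leaf children are already in `order`
    };

    for (const auto& tree : trees)
        visit(tree.first);
    return order;
}

// Flattens every non-leaf node to leaf-only weights, in dependency order.
WeightTrees flattenAll(const WeightTrees& trees)
{
    WeightTrees flat;
    for (const std::string& name : dependencyOrder(trees))
    {
        NodeWeights leafOnly;
        for (const auto& child : trees.at(name))
        {
            const auto childIt = flat.find(child.first);
            if (childIt == flat.end())
                leafOnly[child.first] += child.second; // child is a leaf
            else
                for (const auto& leaf : childIt->second) // child already flattened
                    leafOnly[leaf.first] += child.second * leaf.second;
        }
        flat.emplace(name, std::move(leafOnly));
    }
    return flat;
}
```

With the weights from the question, flattenAll(weightTrees).at("Node0") should contain Node2, Node3 and Node4 only. The order is found with a full traversal, which is consistent with the point above that there is no cheap shortcut for it.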

So, you could just go the dynamic programming route and store nodes that you have fully calculated somewhere. Or even better - you could just flatten the entire tree down to leaf-only weights as you traverse it. As long as you retain the flattening throughout the recursion, this is actually quite elegant in recursive form:

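#include <iostream>
#include <string>
#include <unordered_map>

using NodeWeights = std::unordered_map<std::string, double>;
using NonLeaves = std::unordered_map<std::string, NodeWeights>;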

// Modifies the tree so that the given root has no non-leaf children.
void flattenTree(std::string root, NonLeaves& toFlatten)
{
    auto rootIt = toFlatten.find(root);
    if (rootIt == toFlatten.end())
        return;

    NodeWeights& rootWeights = rootIt->second;

    NodeWeights leafOnlyWeights;

    for (auto kvp : rootWeights)
    {
        const std::string& childRoot = kvp.first;
        double childWeight = kvp.second;

        std::cout << "Checking child " << childRoot << std::endl;

        // If the graph is indeed acyclic, then the root kvp here is untouched
        // by this call (and thus references to it are not invalidated).
        flattenTree(childRoot, toFlatten);

        auto childIt = toFlatten.find(childRoot);

        // The child is a leaf after flattening: Do not modify anything.
        if (childIt == toFlatten.end())
        {
            leafOnlyWeights[childRoot] = childWeight;
            continue;
        }

        // Child is still not a leaf (but all its children are now leaves):
        // Redistribute its weight among our other child weights.
        const NodeWeights& leafWeights = childIt->second;
        for (auto leafKvp : leafWeights)
            leafOnlyWeights[leafKvp.first] += childWeight * leafKvp.second;
    }

    rootWeights = leafOnlyWeights;
}

int main()
{
    NonLeaves weightTrees = {{ "Node0", {{ "Node1",0.5 },{ "Node2",0.3 },{ "Node3",0.2 }} },
                             { "Node1", {{ "Node2",0.1 },{ "Node4",0.9 }} }};

    auto flattenedTree = weightTrees;
    flattenTree("Node0", flattenedTree);

    NodeWeights w = flattenedTree["Node0"]; // Should give {Node2: 0.35, Node3: 0.20, Node4: 0.45}

    for (auto kvp : w)
      std::cout << kvp.first << ": " << kvp.second << std::endl;
}


Since each node is flattened at most once, you cannot run into the exponential runtime your original algorithm has.
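For comparison, the dynamic programming route mentioned earlier (storing nodes that have been fully calculated) might look roughly like the following if the input map should stay unmodified. This is only a sketch under the same acyclicity assumption. The function name leafWeights, the alias WeightTrees and the separate cache parameter are hypothetical choices, not part of the code above.

```cpp
#include <string>
#include <unordered_map>
#include <utility>

using NodeWeights = std::unordered_map<std::string, double>;
using WeightTrees = std::unordered_map<std::string, NodeWeights>; // hypothetical alias

// Returns a pointer to the leaf-only weights of `name`, or nullptr if `name`
// is a leaf. Each non-leaf node is computed at most once and stored in
// `cache`. Assumes the graph is acyclic.
const NodeWeights* leafWeights(const std::string& name,
                               const WeightTrees& trees,
                               WeightTrees& cache)
{
    const auto treeIt = trees.find(name);
    if (treeIt == trees.end())
        return nullptr; // leaf: nothing to decompose

    const auto cached = cache.find(name);
    if (cached != cache.end())
        return &cached->second; // already flattened by an earlier call

    NodeWeights leafOnly;
    for (const auto& child : treeIt->second)
    {
        const NodeWeights* sub = leafWeights(child.first, trees, cache);
        if (sub == nullptr)
            leafOnly[child.first] += child.second; // child is a leaf
        else
            for (const auto& leaf : *sub) // distribute the child's weight
                leafOnly[leaf.first] += child.second * leaf.second;
    }

    // Pointers to unordered_map elements stay valid across later insertions,
    // so handing out the address of the cached entry is safe here.
    return &cache.emplace(name, std::move(leafOnly)).first->second;
}
```

Calling leafWeights("Node0", weightTrees, cache) with an initially empty cache should yield the same three leaf weights as the flattening version, and the cache can be reused for further queries on other nodes.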
