Sebastian D'Agostino
Sebastian D'Agostino

Reputation: 1675

How to optimize heavy map insertion in C++ regarding CPU and memory

I am iterating a map where I need to add elements on that map depending on a condition that an element is not found (it could be any other condition).

My main problem is that with a big scale of updates to be added, the application takes the whole CPU and all the memory.

State Class:

class State {

    int id;

    int timeStamp;

    int state;

}

Method in State:

void State::updateStateIfTimeStampIsHigher(const State& state) {
    if (this->id == state.getId() && state.getTimeStamp() > this->getTimeStamp()) {
        this->timeStamp = state.getTimeStamp();
        this->state = state.getState();
    }
}

Loop Code:

std::map<int, State> data;

const std::map<int, State>& update;

for (auto const& updatePos : update) {
    if (updatePos.first != this->toNodeId) {
        std::map<int, State>::iterator message = data.find(updatePos.first);
        if (message != data.end() && message->first) {
            message->second.updateStateIfTimeStampIsHigher(updatePos.second);
        } else {
            data.insert(std::make_pair(updatePos.first, updatePos.second));
        }
    }
}

Watching KCacheGrind data it looks like the data.insert() line takes most time / memory. I am new to KCacheGrind, but this line seemed to be around 72% of the cost.

Do you have any suggestions on how to improve this?

Upvotes: 1

Views: 1668

Answers (2)

Sebastian D&#39;Agostino
Sebastian D&#39;Agostino

Reputation: 1675

I could find a satisfying answer thanks to researching the comments to the question. It did help a little bit to change from map to unordered_map but I still got unsatisfying results.

I ended up using Google's sparsehash that provides a better resource usage despite some drawbacks from erasing entries (which I do).

The code solution is as follows. First I include the required library:

#include <sparsehash/sparse_hash_map>

Then, my new data definition looks like:

struct eqint {
    bool operator()(int i1, int i2) const {
        return i1 == i2;
    }
};

google::sparse_hash_map<int, State, std::tr1::hash<int>, eqint> data;

Since I have to use "erase" I have to do this after the sparsemap construction:

data.clear_deleted_key();
data.set_deleted_key(-1);

Finally my loop code changes very little:

for (auto const& updatePos : update) {
    if (updatePos.first != this->toNodeId) {
        google::sparse_hash_map<int, State, std::tr1::hash<int>, eqint>::iterator msgIt = data.find(updatePos.first);
        if (msgIt != data.end() && msgIt->first) {
            msgIt->second.updateStateIfTimeStampIsHigher(updatePos.second);
        } else {
            data[updatePos.first] = updatePos.second;
        }
    }
}

The time before making the changes for a whole application run under specific parameters was:

real    0m28,592s
user    0m27,912s
sys     0m0,676s

And the time after making the changes for the whole application run under the same specific parameters is:

real    0m37,464s
user    0m37,032s
sys     0m0,428s

I run it with other cases and the results where similar (from a qualitative point of view). The system time and resourse usage (CPU and memory) decreases and the user time increases.

Overall I am satisfied with the tradeoff since I was more concerned about resource usage than execution time (the application is a simulator and it was not able to finish and get results under really heavy load and now it does).

Upvotes: 1

R2RT
R2RT

Reputation: 2146

Your question is quite general, but I see tho things to make it run faster:

  1. Use hinted insertion / emplacement. When you add new element its iterator is returned. Assuming that both maps are ordered in same fashion you can tell where was the last one inserted so lookup should be faster (could use some benchmarking here).
  2. Use emplace_hint for faster insertion

Sample code here:

std::map<int, long> data;

const std::map<int, long> update;
auto recent = data.begin();

for (auto const& updatePos : update) {
    if (updateElemNotFound) { 
        recent = data.emplace_hint(recent, updatePos);
    }
}

Also, if you want to trade CPU over memory you could use unordered_map (Is there any advantage of using map over unordered_map in case of trivial keys?), but first dot would not matter anymore.

Upvotes: 1

Related Questions