Korchkidu
Korchkidu

Reputation: 4946

Most efficient way to assign values to a map of maps

Given a map of maps like:

  std::map<unsigned int, std::map<std::string, MyBase*>> m_allMyObjects;

What would be the most efficient way to insert/add/"emplace" an element into m_allMyObjects given an unsigned int and a std::string taking optimization into account (on modern compilers)?

What would be the most efficient way to retrieve an element then?

m_allMyObjects may potentially contain up to 100'000 elements in the future.

Upvotes: 0

Views: 130

Answers (1)

Michael Karcher
Michael Karcher

Reputation: 4021

Common knowledge and folklore about how to efficiently insert into maps (typically telling you to avoid operator[] and prefering the shiny new emplace) considers the costs of constructing and copying the values in the map. In your case, those values are plain pointers which can be copied at virtually no expense, and copying pointers can be aggressively optimized by the compiler.

On the other hand, you actually do have an object that is expensive to handle, namely the key of type std::string. You need to watch out for copies (moved or copied) of the key to determine performance. Obviously, for the tree lookup, you already need the string object, even if you have it as char*, as there is no insertion function that is templated over the type of key. This means for looking up the place to insert, you use one certain std::string object, but once the map node gets created, the new std::string object inside the map is copy-initialized from it (possibly moved). Avoiding everything in excess of that single copy/move should be your goal.

Example time!

#include <map>
#include <cstdio>

struct noisy {
  noisy(int v) : val(v) {}
  noisy(const noisy& src) : val(src.val) { std::puts("copy ctor"); }
  noisy(noisy&& src) : val(src.val)      { std::puts("move ctor"); }
  noisy& operator=(const noisy& src)
  { val = src.val; std::puts("copy assign"); return *this; }
  noisy& operator=(noisy&& src)
  { val = src.val; std::puts("move assign"); return *this; }
  int val;
};

bool operator<(const noisy& a, const noisy& b)
{
  return a.val < b.val;
}

int main(void)
{
  std::map<noisy,int> m;
  std::puts("Operator[]");
  m[noisy(1)] = 3;
  std::puts("insert/make_pair");
  m.insert(std::make_pair(noisy(2), 3));
  std::puts("insert/make_pair/ref");
  m.insert(std::make_pair<noisy&&,int>(noisy(3), 3));
  std::puts("insert/pair/ref");
  m.insert(std::pair<noisy&&,int>(noisy(4), 3));
  std::puts("emplace");
  m.emplace(noisy(5), 3);
}

compiled with g++ 4.9.1, -std=c++11, -O2, the result is

Operator[]
move ctor
insert/make_pair
move ctor
move ctor
insert/make_pair/ref
move ctor
move ctor
insert/pair/ref
move ctor
emplace
move ctor

Which shows: avoid everything that creates an intermediate pair containing a copy of the key! Be aware that std::make_pair does never create a pair that contains references, even if it can take the parameters by reference! Whenever you pass a pair containing the copy of the key, the key gets copied into the pair and later into the map.

The expression suggested by MarkMB, namely m[int_k][str_k] = ptr, is quite good, and likely produces optimal code. There is no reason for the first index (int_k) to not use [], as you want a default constructed sub-map if the index is not used yet, so there is no unnecessary overhead. As we have seen, indexing with the string gets away with a single copy, so you are fine. If you can afford to lose your string, m[int_k][std::move(str_k)] = ptr might be a win, though. As discussed in the beginning, using emplace instead of [] is only about the values, which are virtually free to handle in your case.

Upvotes: 3

Related Questions