User

Reputation: 610

Compiler optimizing return by reference vs. by value

I want to understand which specific optimizations, if any, the compiler can perform here.

Of the two functions below, get() returns a string by reference and get2() returns it by value. The map is a global.

I want to understand this because it is a common pattern. What happens if we have a map of objects, a map of maps, or very big strings (the map is not limited to strings)? How costly can it get? I understand that the answer is compiler-dependent, but can we have a list of the unknowns? It would really help.

#include <map>
#include <string>
#include <utility>

std::map<std::string, std::string> value;

std::string& get(std::string& key) {
    return value[key];
}

std::string get2(std::string& key) {
    return value[key];
}

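int main()
{
    value.insert(std::make_pair("name", "XXXXXXX"));
    std::string keyaa = "name";
    auto new_val = get(keyaa);    // auto deduces std::string, so the referenced string is copied here
    auto new_val2 = get2(keyaa);  // get2 returns a copy of the map element
}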

Upvotes: 1

Views: 184

Answers (1)

rustyx

Reputation: 85541

From the C++ language perspective there is no guarantee that the copy in get2 will be elided. Mandatory copy elision (C++17) covers only prvalue operands (i.e. values created in the return statement itself). Even the "permitted" optimization (NRVO) covers only local variables of the function, not pre-existing objects such as a map element.

So we can only hope that compilers today are smart enough to optimize away the copy, which means we have to test it!

I've rewritten the example slightly to make it maximally easy for the compiler to optimize away the string:

#include <benchmark/benchmark.h>  // Google Benchmark (quick-bench includes it implicitly)
#include <string>
#include <map>

struct Test {
    std::map<std::string, std::string> value = {{"name", "XXXXXX"}};

    std::string const& get(std::string const& key) {
        return value[key];
    }

    std::string get2(std::string const& key) {
        return value[key];
    }
};

static void TestReturnByReference(benchmark::State& state) {
  Test test;
  std::string key = "name";
  for (auto _ : state) {
    size_t n = test.get(key).size();
    benchmark::DoNotOptimize(n);
  }
}

BENCHMARK(TestReturnByReference);

static void TestReturnByValue(benchmark::State& state) {
  Test test;
  std::string key = "name";
  for (auto _ : state) {
    size_t n = test.get2(key).size();
    benchmark::DoNotOptimize(n);
  }
}

BENCHMARK(TestReturnByValue);

And no, as it turns out, neither GCC nor Clang is able to optimize it away entirely:

GCC 10.2, -O3 (link to quick-bench): noticeable difference.

Clang 11 (libc++), -O3: better, but still slower.

Conclusion: returning an existing string is faster by reference.


Note: starting from C++17, you can return std::string_view to avoid worrying about this, as long as the caller does not use the view after the map entry has been modified or erased:

std::string_view get(std::string const& key) {
    return value[key];
}

Upvotes: 1
