James Ko

Reputation: 34519

How to avoid copy-constructing a return value

I'm a newcomer to C++ and I recently ran into a problem when returning a reference to a local variable. I solved it by changing the return type from std::string& to std::string. However, to my understanding this can be very inefficient. Consider the following code:

#include <string>
using std::string;

string hello()
{
    string result = "hello"; // local object
    return result;           // returned by value
}

int main()
{
    string greeting = hello();
}

To my understanding, what happens is:

This probably doesn't matter that much for std::string, but it can definitely get expensive if you have, for example, a hash table with hundreds of entries.

How do you avoid copy-constructing a returned temporary, and instead return a copy of the pointer to the object (essentially, a copy of the local variable)?


Sidenote: I've heard that the compiler will sometimes perform return-value optimization to avoid calling the copy constructor, but I think it's best not to rely on compiler optimizations to make your code run efficiently.

Upvotes: 9

Views: 6354

Answers (3)

Pete Baughman

Reputation: 3034

I disagree with the sentence "I think it's best not to rely on compiler optimizations to make your code run efficiently." That's basically the compiler's whole job. Your job is to write clear, correct, and maintainable source code. For every performance issue I've ever had to fix, I've had to fix a hundred or more issues caused by a developer trying to be clever instead of doing something simple, correct, and maintainable.

Let's take a look at some of the things you could do to try to "help" the compiler and see how they affect the maintainability of the source code.

  • You could return the data via reference

For example:

void hello(std::string& outString);

Returning data using a reference makes the code at the call site hard to read. It's nearly impossible to tell which function calls mutate state as a side effect and which don't. Even if you're really careful about const-qualifying the references, it's going to be hard to read at the call site. Consider the following example:

void hello(std::string& outString); //<-This one could modify outString
void out(const std::string& toWrite); //<-This one definitely doesn't.

. . .

std::string myString;
hello(myString); //<-This one maybe mutates myString - hard to tell.
out(myString);   //<-This one certainly doesn't, but it looks identical to the one above

Even the declaration of hello isn't clear. Does it modify outString, or was the author just sloppy and forgot to const qualify the reference? Code that is written in a functional style is easier to read and understand and harder to accidentally break.

Avoid returning the data via reference

  • You could return a pointer to the object instead of returning the object.

Returning a pointer to the object makes it hard to be sure your code is even correct. Unless you use a unique_ptr you have to trust that anybody using your method is thorough and makes sure to delete the pointer when they're done with it, but that isn't very RAII. std::string is already a type of RAII wrapper for a char* that abstracts away the data lifetime issues associated with returning a pointer. Returning a pointer to a std::string just re-introduces the problems that std::string was designed to solve. Relying on a human being to be diligent and carefully read the documentation for your function and know when to delete the pointer and when not to delete the pointer is unlikely to have a positive outcome.

Avoid returning a pointer to the object instead of returning the object
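If a pointer really does have to be returned, the unique_ptr variant mentioned above could look roughly like the sketch below. The names make_greeting and greeting_length are made up for illustration, and std::make_unique assumes a C++14 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical factory: ownership is part of the return type, so the caller
// does not have to remember to delete anything.
std::unique_ptr<std::string> make_greeting()
{
    return std::make_unique<std::string>("hello"); // requires C++14
}

// Hypothetical caller: the string is freed automatically when `greeting`
// goes out of scope at the end of this function.
std::size_t greeting_length()
{
    std::unique_ptr<std::string> greeting = make_greeting();
    assert(greeting != nullptr);
    return greeting->size();
}
```

Even so, this only adds a heap allocation on top of the one std::string may already make, so returning the std::string itself is usually still the simpler choice.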

  • That brings us to move constructors.

A move constructor will just transfer ownership of the pointed-to data from 'result' to its final destination. Afterwards, 'result' is left in a valid but unspecified state, but that doesn't matter - your function has ended and the 'result' object has gone out of scope. No copy, just a transfer of ownership of the pointer with clear semantics.

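Normally the compiler will call the move constructor for you. If you're really paranoid (or have specific knowledge that the compiler isn't going to help you) you can use std::move, but be aware that writing return std::move(result); for a local variable prevents copy elision, so it is rarely an improvement.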

Use move constructors if at all possible
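To make the "transfer of ownership of the pointer" concrete, here is a minimal sketch of a hand-written move constructor. Buffer and move_reuses_allocation are hypothetical names invented for this illustration; std::string does the equivalent internally, with implementation details that vary between standard libraries.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical owning buffer, used only to show what a move constructor does.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new char[n]()), size_(n) {}
    ~Buffer() { delete[] data_; }

    // Copy constructor: allocates new storage and copies every byte.
    Buffer(const Buffer& other) : data_(new char[other.size_]), size_(other.size_) {
        for (std::size_t i = 0; i < size_; ++i) data_[i] = other.data_[i];
    }

    // Move constructor: takes over the pointer and leaves the source empty.
    Buffer(Buffer&& other) noexcept : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    // Assignment is left out to keep the sketch short.
    Buffer& operator=(const Buffer&) = delete;
    Buffer& operator=(Buffer&&) = delete;

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }

private:
    char* data_;
    std::size_t size_;
};

// Returns true if the moved-to object ended up with the original allocation.
bool move_reuses_allocation()
{
    Buffer source(1024);
    const char* original = source.data();
    Buffer target(std::move(source)); // move constructor, no byte-by-byte copy
    assert(source.data() == nullptr); // this Buffer chooses to empty its source
    return target.data() == original && target.size() == 1024;
}
```

The cost of the move is two member assignments regardless of how large the buffer is, which is why falling back to a move is cheap even for big objects.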

Finally, modern compilers are amazing. With a modern C++ compiler, 99% of the time the compiler is going to do some sort of optimization to eliminate the copy. The other 1% of the time it's probably not going to matter for performance. In specific circumstances the compiler can rewrite a function like std::string GetString(); to void GetString(std::string& outVar); automatically. The code is still easy to read, but in the final assembly you get all of the real or imagined speed benefits of returning by reference. Don't sacrifice readability and maintainability for performance unless you have specific knowledge that the solution doesn't meet your business requirements.

Upvotes: 6

AnT stands with Russia

Reputation: 320531

The description in your question is pretty much correct. But it is important to understand that this is the behavior of the abstract C++ machine. In fact, the canonical description of abstract return behavior is even less optimal:

  1. result is copied into a nameless intermediate temporary object of type std::string. That temporary persists after the function's return.
  2. That nameless intermediate temporary object is then copied to greeting after function returns.

Most compilers have always been smart enough to eliminate that intermediate temporary in full accordance with the classic copy elision rules. But even without that intermediate temporary, the behavior has always been seen as grossly suboptimal, which is why compilers were given a lot of freedom to optimize return-by-value contexts. Originally it was Return Value Optimization (RVO). Later, Named Return Value Optimization (NRVO) was added to it. And finally, in C++11, move semantics became an additional way to optimize the return behavior in such cases.

Note that under NRVO in your example the initialization of result with "hello" actually places that "hello" directly into greeting from the very beginning.
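One rough way to observe this is to compare the address of result inside the function with the address of greeting in the caller. The sketch below does that; address_inside_hello and nrvo_was_applied are hypothetical helpers added for the experiment, and because NRVO is permitted rather than required, the outcome depends on the compiler and its flags.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical global, used only to remember where `result` lived.
static std::uintptr_t address_inside_hello = 0;

std::string hello()
{
    std::string result = "hello";
    address_inside_hello = reinterpret_cast<std::uintptr_t>(&result);
    return result; // NRVO candidate
}

// Returns true when `result` and `greeting` occupied the same storage,
// i.e. when "hello" was constructed directly in `greeting`.
bool nrvo_was_applied()
{
    std::string greeting = hello();
    assert(greeting == "hello");
    return reinterpret_cast<std::uintptr_t>(&greeting) == address_inside_hello;
}
```

If the function returns false, the value was moved instead, and greeting still holds "hello" either way.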

So in modern C++ the best advice is: leave it as is and don't avoid it. Return it by value. (And prefer to use immediate initialization at the point of declaration whenever you can, instead of opting for default initialization followed by assignment.)

Firstly, the compiler's RVO/NRVO capabilities can (and will) eliminate the copying. In any self-respecting compiler RVO/NRVO is not something obscure or secondary. It is something compiler writers do actively strive to implement and implement properly.

Secondly, there's always move semantics as a fallback solution if RVO/NRVO somehow fails or is not applicable. Moving is naturally applicable in return-by-value contexts and it is much less expensive than full-blown copying for non-trivial objects. And std::string is a movable type.
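A case where NRVO usually cannot apply is a function that returns one of two named locals. The sketch below counts constructor calls to show that the move constructor, not the copy constructor, is selected there. Tracked, pick and copies_made_by_pick are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical wrapper that counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    std::string text;

    explicit Tracked(const char* s) : text(s) {}
    Tracked(const Tracked& other) : text(other.text) { ++copies; }
    Tracked(Tracked&& other) noexcept : text(std::move(other.text)) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Two candidate locals: the compiler generally cannot construct both of them
// in the caller's storage, so the returned one is moved instead of elided.
Tracked pick(bool first)
{
    Tracked a("first");
    Tracked b("second");
    if (first) {
        return a; // a local named in a return statement is treated as an rvalue
    }
    return b;
}

// Runs the pattern once and reports how many copy constructions happened.
int copies_made_by_pick()
{
    Tracked::copies = 0;
    Tracked::moves = 0;
    Tracked chosen = pick(true);
    assert(chosen.text == "first");
    return Tracked::copies;
}
```

From C++11 onward the copy count stays at zero here. How many moves are counted depends on how much the compiler elides.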

Upvotes: 14

Ari0nhh

Reputation: 5920

There are plenty of ways to achieve that:

1) Return the data by reference

void SomeFunc(std::string& sResult)
{
  sResult = "Hello world!";
}

2) Return a pointer to the object

CSomeHugeClass* SomeFunc()
{
  CSomeHugeClass* pPtr = new CSomeHugeClass();
  //...
  return(pPtr);
}

3) C++11 can use a move constructor in such cases.
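For the hash table with hundreds of entries mentioned in the question, option 3 needs no special code at all, since std::unordered_map is movable. A minimal sketch, where build_table and table_size_after_return are hypothetical names and the contents are placeholder data:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical builder: fills a table and returns it by value.
std::unordered_map<int, std::string> build_table(std::size_t entries)
{
    std::unordered_map<int, std::string> table;
    table.reserve(entries);
    for (std::size_t i = 0; i < entries; ++i) {
        table.emplace(static_cast<int>(i), "value " + std::to_string(i));
    }
    return table; // elided or moved; the entries are not copied one by one
}

// Hypothetical caller: receives the whole table without a deep copy.
std::size_t table_size_after_return()
{
    std::unordered_map<int, std::string> table = build_table(500);
    assert(table.at(499) == "value 499");
    return table.size();
}
```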

Upvotes: 3
