Reputation: 5913
The Standard Template Library documentation for list says:
void push_back ( const T& x );
Add element at the end. Adds a new element at the end of the list, right after its current last element. The content of this new element is initialized to a copy of x.
These semantics differ greatly from the Java semantics, and confuse me. Is there a design principle in the STL that I'm missing? "Copy data all the time"? That scares me. If I add a reference to an object, why is the object copied? Why isn't just the object passed across?
There must be a language design decision here, but most of the commentary I've found on Stack Overflow and other sites focuses on the exception-safety issues that arise because all this object copying can throw. If you don't copy, and just handle references, all those exception problems go away. Very confused.
Please note: in this legacy code base I work with, boost is not an option.
Upvotes: 10
Views: 9880
Reputation: 131
Without reference counting or garbage collection there is no way to maintain shared ownership, so the container maintains single ownership by copying.
Consider the common case where you want to add a stack-allocated object to a list that will outlive it:
void appendHuzzah(list<string> &strs) {
    // The temporary string dies at the end of this statement; the list keeps a copy.
    strs.push_back(string("huzzah!"));
}
The list can't keep the original object because that object will be destroyed when it falls out of scope. By copying, the list obtains its own object whose life-span is entirely under its own control. If it were otherwise, such straightforward usage would crash and be useless and we would always have to use lists of pointers.
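To make the lifetime point concrete, here is a minimal sketch. The helper names append_local and surviving_element are made up for illustration; only std::list and std::string come from the standard library.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical helper: pushes a copy of a local string into the caller's list.
void append_local(std::list<std::string>& strs) {
    std::string local("huzzah!");  // lives only until this function returns
    strs.push_back(local);         // the list copy-constructs its own element
    local = "changed";             // modifies the local only, not the list's copy
}                                  // 'local' is destroyed here

// Hypothetical helper: returns what the list still holds once the local is gone.
std::string surviving_element() {
    std::list<std::string> strs;
    append_local(strs);
    assert(strs.size() == 1);
    return strs.back();            // still valid: this element is owned by the list
}
```

Calling surviving_element() should give back "huzzah!", since the list's element never depended on the local variable.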
Java distinguishes between primitive types and reference types. In C++, every type behaves like a Java primitive by default: a variable holds the value itself, not a reference to it.
Upvotes: 2
Reputation: 36517
Java works the same way, actually. Allow me to explain:
Object obj = new Object();
List<Object> list = new LinkedList<Object>();
list.add(obj);
What is the type of obj? It is a reference to an Object. The actual object is floating around somewhere on the heap; the only thing you can do in Java is pass around references to it. You pass a reference to the object to the list's add method, and the list stores a copy of that reference in itself. You can later modify the named reference obj without affecting the separate copy of that reference stored in the list. (Of course, if you modify the object itself, you can see that change through either reference.)
C++ has more options. You can emulate Java:
class Object {};
// ...
Object* obj = new Object;
std::list<Object*> list;
list.push_back(obj);
What is the type of obj? It is a pointer to an Object. When you pass it to the list's push_back method, the list stores a copy of that pointer in itself. This has the same semantics as Java.
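A small sketch of that Java-like behaviour, assuming a made-up Counter struct and helper function: a change made through the original pointer is visible through the pointer stored in the list, because both refer to the same heap object.

```cpp
#include <cassert>
#include <list>

// Hypothetical type used only for this illustration.
struct Counter { int value; };

// Returns the value seen through the list's pointer after the object
// has been modified through the original pointer.
int value_seen_through_list() {
    Counter* obj = new Counter();
    obj->value = 1;

    std::list<Counter*> list;
    list.push_back(obj);             // copies the pointer, not the Counter

    obj->value = 42;                 // modify the single shared object
    assert(list.front() == obj);     // both pointers refer to the same object
    int seen = list.front()->value;

    list.clear();                    // drops the stored pointer only
    delete obj;                      // no garbage collector: free it ourselves
    return seen;
}
```

Note the explicit delete at the end. That is the part Java does for you, and it is the main cost of storing raw pointers in a container.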
But if you think about it from an efficiency standpoint… how big is a C++ pointer/Java reference? 4 bytes or 8 bytes, depending on your architecture. If the object that you care about is around that size or smaller, why would you bother putting it on the heap and then passing pointers to it everywhere? Just pass the object:
class Object {};
// ...
Object obj;
std::list<Object> list;
list.push_back(obj);
Now, obj is an actual object. You pass it to the list's push_back method, which stores a copy of that object in itself. This is a C++ idiom, in a way. Not only does it make sense for small objects, where a pointer is pure overhead, it also makes things easier in a non-GC language (there's nothing lying on a heap that might be accidentally leaked), and if the object's lifespan is naturally tied to the list (i.e. if it is removed from the list, then semantically it should no longer exist), then you might as well store the whole object in the list. It also has cache locality benefits (when used in a std::vector, anyway).
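For contrast with the pointer version, here is a sketch of the by-value case. Point and x_stored_in_list are illustrative names, not anything from the question. After push_back, the original and the stored element are two unrelated objects.

```cpp
#include <cassert>
#include <list>

// Hypothetical small value type.
struct Point { int x; int y; };

// Returns the x stored in the list after the original has been modified.
int x_stored_in_list() {
    Point p = {1, 2};
    std::list<Point> points;
    points.push_back(p);              // the list stores its own copy of p

    p.x = 99;                         // affects only p
    assert(&points.front() != &p);    // two distinct objects
    return points.front().x;
}
```

Nothing here needs a delete: the stored Point is destroyed when it is erased from the list or when the list itself goes away.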
You may ask, "Why does push_back take a reference argument, then?" There's a simple enough reason for that. By default, every parameter is passed by value (again, in both C++ and Java). If you have a std::list of Object*, fine: you pass in your pointer, and a copy of that pointer is made and passed into the push_back function. Then, inside that function, another copy of that pointer is made and stored into the container.
That's fine for a pointer. But in C++, copying objects can be arbitrarily complicated. A copy constructor can do anything. In some circumstances, copying an object twice (once into the function, and again into the container) can be a performance problem. So push_back takes its argument by const reference: it makes a single copy, straight from the original object into the container.
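One way to see the difference is to count copy-constructor calls. The sketch below uses a made-up Tracked type and two made-up wrapper functions that stand in for a by-value and a by-const-reference push_back. It is an illustration of the argument, not how any particular standard library is implemented.

```cpp
#include <cassert>
#include <list>

// Hypothetical type that counts how often it is copy-constructed.
struct Tracked {
    static int copies;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
};
int Tracked::copies = 0;

// Stand-in for a push_back that takes its argument by value.
void add_by_value(std::list<Tracked>& out, Tracked t) { out.push_back(t); }

// Stand-in for the real signature: argument taken by const reference.
void add_by_const_ref(std::list<Tracked>& out, const Tracked& t) { out.push_back(t); }

int copies_when_passed_by_value() {
    std::list<Tracked> out;
    Tracked original;
    Tracked::copies = 0;
    add_by_value(out, original);      // one copy into the parameter, one into the list
    return Tracked::copies;
}

int copies_when_passed_by_const_ref() {
    std::list<Tracked> out;
    Tracked original;
    Tracked::copies = 0;
    add_by_const_ref(out, original);  // only the copy into the list
    return Tracked::copies;
}
```

With an lvalue argument as above, the by-value version should report two copies and the const-reference version one.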
Upvotes: 4
Reputation: 8831
The STL always stores exactly what you tell it to store. list<T> is always a list of T, so everything will be stored by value. If you want a list of pointers, use list<T*>; that will be similar to the semantics in Java.
This might tempt you to try list<T&>, but that is not possible. References in C++ have different semantics than references in Java. In C++, references must be initialized to point to an object. After a reference is initialized, it will always point to this object. You can never make it point to a different object. This makes it impossible to have a container of references in C++. Java references are more closely related to C++ pointers, so you should use list<T*> instead.
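The "can never point to a different object" rule is easy to check. In this sketch (function names are made up), assigning to a reference writes through to the object it is bound to, while assigning to a pointer reseats the pointer.

```cpp
#include <cassert>

// Returns the value of 'a' after "assigning" b to a reference bound to a.
int a_after_assigning_through_reference() {
    int a = 1;
    int b = 7;
    int& ref = a;          // bound to a for its whole lifetime
    ref = b;               // does NOT rebind; copies b's value into a
    assert(&ref == &a);    // ref still refers to a
    return a;
}

// Returns the value a pointer sees after being reseated from a to b.
int value_after_reseating_pointer() {
    int a = 1;
    int b = 7;
    int* ptr = &a;
    ptr = &b;              // the pointer itself changes; a is untouched
    assert(a == 1);
    return *ptr;
}
```

Both functions return 7, but for different reasons: the first overwrote a, the second left a alone and now looks at b. The second behaviour is the one a Java reference has.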
Upvotes: 12
Reputation: 258618
You don't add a reference to an object. You pass an object by reference. That's different. If you didn't pass by reference, an extra copy might have been made even before the actual insert.
And it does a copy because you need a copy, otherwise code like:
std::list<Obj> x;
{
    Obj o;
    x.push_back(o);  // the list stores a copy of o
}                    // o is destroyed here
would leave the list referring to a destroyed object, because o went out of scope. If you want something similar to Java, consider using shared_ptr. This gives you the advantages you're used to in Java - automatic memory management, and lightweight copying.
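A minimal sketch of the shared_ptr approach, assuming a C++11 compiler where std::shared_ptr lives in <memory>. Since the question rules out boost, note that some pre-C++11 toolchains ship a TR1 version as std::tr1::shared_ptr instead, so check what your compiler provides. The Obj layout and the function name are invented for the example.

```cpp
#include <cassert>
#include <list>
#include <memory>

// Hypothetical payload type.
struct Obj { int id; };

// Returns the id read through the list after the original handle is gone.
int id_after_scope_exit() {
    std::list<std::shared_ptr<Obj> > x;
    {
        std::shared_ptr<Obj> o(new Obj());
        o->id = 7;
        x.push_back(o);                    // copies the handle, not the Obj
        assert(o.use_count() == 2);        // o and the list element share ownership
    }                                      // o is destroyed; the Obj is not
    assert(x.front().use_count() == 1);    // the list is now the sole owner
    return x.front()->id;
}
```

The Obj is deleted automatically when the last shared_ptr to it goes away, here when the list is destroyed at the end of the function.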
Upvotes: 4
Reputation: 8027
It's called 'value semantics'. C++ is normally coded to copy values, unlike Java where, primitive types aside, you copy references. It might scare you, but personally Java's reference semantics scare me more. In C++ you have a choice: if you want reference semantics, just use pointers (preferably smart ones). Then you will be closer to the Java you are used to. But remember that C++ has no garbage collection, which is why you should normally use smart pointers.
Upvotes: 6