RussoTuristo

Reputation: 18

Do constructors reside in objects?

I'm reading "The C++ Programming Language, 4th edition" by Stroustrup, and the text contains these lines:

"I use the standard-library memcpy() to copy the bytes of the source into the target. That’s a low-level and sometimes pretty nasty function. It should be used only where there are no objects with constructors or destructors in the copied memory because memcpy() knows nothing about types."

As I understand it so far, a class object is a contiguous block of memory that contains ONLY the class representation, that is, only the data members of its class. All member functions (constructors, destructor, assignments, etc.) seemed to me to be stored "elsewhere", not in the class object. But after reading these words, it seems to me that constructors are stored in class objects too, that is, some bytes in the class object's block of memory represent the constructor of the corresponding class. Is that really so? Am I wrong in my picture of what a class object's block of memory is?

Upvotes: 0

Views: 248

Answers (1)

Peter

Reputation: 36597

No, constructors do not reside in objects. They are special functions called during construction of an object, and their role is (over-simplistically) to turn a chunk of raw memory into an object (aka an instance of the class).
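One rough way to check this is to compare the size of a class that has constructors, a destructor and a member function with the size of a plain struct holding the same data. The sketch below uses two hypothetical types, `PlainInt` and `CountedInt`, which are not from the book. The standard does not spell out exact object sizes, so the equality is what mainstream compilers are expected to produce rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types for illustration only.

// Just the data, no user-written member functions.
struct PlainInt {
    int value;
};

// Same data, plus constructors, a destructor and an accessor.
class CountedInt {
public:
    CountedInt() : value_(0) {}
    explicit CountedInt(int v) : value_(v) {}
    ~CountedInt() {}
    int get() const { return value_; }

private:
    int value_;
};

// Returns true when adding the member functions did not change the
// size of the object, i.e. their code is not stored inside it.
bool same_size() {
    return sizeof(CountedInt) == sizeof(PlainInt);
}
```

If `same_size()` returns true, the code of the constructors and destructor takes up no room in each `CountedInt`; the object holds only its `int`.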

The problem is that constructors can do non-trivial things, such as in these cases:

  • The object contains a pointer member, and the constructors initialise that pointer so it points at dynamically allocated memory;
  • The object manages a thread of execution (like std::thread since C++11), and the constructors initialise that thread;
  • The object manages a mutex, and the constructors initialise that mutex.

In all these cases, each object contains some sort of handle that represents a resource that exists outside the object itself. The constructor initialises that handle so it refers to a particular resource (e.g. a chunk of dynamically allocated memory). Member functions assume, when they are called, that the resource has been properly allocated by a constructor, and manipulate that resource (e.g. store data in the dynamically allocated memory, suspend a thread, lock or unlock a mutex). When the object ceases to exist, the destructor assumes the resource has been properly allocated and used, and then releases it. Once the destructor completes, the managed resource no longer exists - any attempt to access it causes undefined behaviour.

From a design perspective:

  • The constructor establishes a set of invariants (a set of conditions that all member functions, including the destructor, can assume to always be true);
  • Member functions other than the destructor maintain those invariants (i.e. they may change things temporarily, but before they return, they ensure the invariants are still in effect). This allows - among other things - member functions to be called one after the other;
  • The destructor is eventually called as part of the process of an object being destroyed. It relies on the invariants established by constructors still being valid, and releases the resources.
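The three roles above can be sketched with a small class that owns a dynamically allocated array. `IntBuffer` and `buffer_demo` are hypothetical names made up for this illustration, and copying is simply disabled to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class for illustration only.
// Invariant: data_ points at an array of exactly size_ ints owned by
// this object.
class IntBuffer {
public:
    // Constructor: establishes the invariant.
    explicit IntBuffer(std::size_t n) : data_(new int[n]()), size_(n) {}

    // Destructor: relies on the invariant and releases the resource.
    ~IntBuffer() { delete[] data_; }

    // Copying is disabled here so the invariant cannot be duplicated.
    IntBuffer(const IntBuffer&) = delete;
    IntBuffer& operator=(const IntBuffer&) = delete;

    // Member functions: rely on the invariant and leave it intact.
    void set(std::size_t i, int v) { assert(i < size_); data_[i] = v; }
    int get(std::size_t i) const { assert(i < size_); return data_[i]; }
    std::size_t size() const { return size_; }

private:
    int* data_;         // the handle
    std::size_t size_;
};

// Uses the buffer through its public interface only.
int buffer_demo() {
    IntBuffer b(4);
    b.set(2, 42);
    return b.get(2) + static_cast<int>(b.size());
}   // b's destructor releases the array here
```

Note that the object itself holds only the pointer and the size; the array it manages lives outside the object, which is exactly the part a byte-wise copy would not duplicate.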

Now, imagine what happens if - after construction - memcpy() is used to copy the data from one object to another. This does not involve any constructor; it simply copies the handle (the pointer to dynamically allocated memory, etc.) by value. The net effect is that two distinct objects end up holding handles to the same resource. When member functions of either object are called, they operate on the resource that the other object also relies on.

As long as both objects exist, this can cause surprises for developers (or, worse, end users) because changing one object has the effect of magically changing the other. The two objects interact in ways that are not intended. (And, if such things ARE intended, there are often unwanted side-effects that are difficult to control.)
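The sharing described above can be sketched without running into undefined behaviour by using a handle type that deliberately has no constructors or destructor, so that memcpy() on it is well-defined. `RawHandle` and `handles_alias_after_memcpy` are hypothetical names for this illustration; a real resource-managing class would have a destructor, and then the same copy would set up the double release discussed in this answer.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical handle type for illustration only. It has no
// constructors or destructor, so copying it with memcpy is allowed.
struct RawHandle {
    int* data;
};
static_assert(std::is_trivially_copyable<RawHandle>::value,
              "RawHandle must be trivially copyable for this sketch");

// Returns true if, after memcpy, both handles refer to the same int
// and a write through one is visible through the other.
bool handles_alias_after_memcpy() {
    int* storage = new int(1);
    RawHandle a{storage};
    RawHandle b{nullptr};

    std::memcpy(&b, &a, sizeof a);  // copies the pointer value only
    *b.data = 99;                   // write through b ...

    // ... and the change shows up through a as well.
    bool shared = (a.data == b.data) && (*a.data == 99);

    delete storage;  // released exactly once; a.data and b.data now dangle
    return shared;
}
```

The int itself was never duplicated: only the address was copied, so both handles depend on a single allocation that can be released only once.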

Things get even worse when either of the objects ceases to exist.

Consider what happens if one of the objects ceases to exist, but the other lives on and is used (e.g. member functions called). When the destructor is called for the first destroyed object, it happily releases the managed resource. That's its job. However, member functions of the surviving object happily assume that the resource still exists, have no way (within standard C++) to detect that the resource no longer exists, and continue to manipulate it. That invariably causes undefined behaviour - for example, manipulating data in dynamically allocated memory that has been released, and no longer exists as far as the program is concerned.

Eventually, the second object will also cease to exist, and its destructor will release its resource. But that resource has already been released (by the destructor when called for the first object) and releasing it again causes undefined behaviour. Like any other member function, the destructor cannot detect that the resource has already been released, so cannot prevent this.

Additionally, memcpy() is unsafe for copying objects that contain or inherit from other objects that, in turn, manage a resource. For example, an object that has a member of type std::string cannot be safely copied with memcpy(), since the std::string dynamically manages memory. Similarly, a class with a base class that, in turn, has a std::string member cannot be copied using memcpy().

The list of circumstances where copying using memcpy() is unsafe goes on and on. The short description that "memcpy() knows nothing about types" is a crude summary of this. memcpy() simply copies the memory occupied by the object, and that doesn't clone any resources outside the object that the object refers to.
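For contrast, here is a minimal sketch of what "cloning the resource" looks like when a copy constructor does the work instead of memcpy(). `OwnedInts` and `copies_are_independent` are hypothetical names, and copy assignment is left out (deleted) to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical class for illustration only.
class OwnedInts {
public:
    explicit OwnedInts(std::size_t n) : data_(new int[n]()), size_(n) {}

    // Copy constructor: allocates a fresh array and copies the
    // elements, so the new object gets its own resource.
    OwnedInts(const OwnedInts& other)
        : data_(new int[other.size_]), size_(other.size_) {
        std::copy(other.data_, other.data_ + other.size_, data_);
    }

    OwnedInts& operator=(const OwnedInts&) = delete;  // omitted for brevity

    ~OwnedInts() { delete[] data_; }

    int& at(std::size_t i) { assert(i < size_); return data_[i]; }
    const int* raw() const { return data_; }

private:
    int* data_;
    std::size_t size_;
};

// Returns true if a copy made through the copy constructor has its own
// storage and changes to it do not affect the original.
bool copies_are_independent() {
    OwnedInts a(3);
    a.at(0) = 7;
    OwnedInts b(a);   // deep copy
    b.at(0) = 8;
    return a.raw() != b.raw() && a.at(0) == 7 && b.at(0) == 8;
}   // each destructor releases its own array, exactly once
```

Because each object owns a separate array, destroying one leaves the other fully usable, which is the property a byte-wise copy cannot provide.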

There are some types (e.g. POD types or - in more recent C++ standards - trivially copyable types) whose objects can be safely copied with memcpy(). But if an object manages a resource outside itself, it will generally not be safe to copy it using memcpy().
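Since C++11, the standard trait std::is_trivially_copyable can be used to ask the compiler whether a type is in the safe category. The sketch below is one possible way to guard a memcpy() call with it; `Point`, `Named`, `DerivedNamed`, `copy_bytes` and `copy_point` are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical types for illustration only.
struct Point { int x; int y; };                  // plain data
struct Named { std::string name; int id; };      // has a std::string member
struct DerivedNamed : Named { int extra; };      // inherits that member

static_assert(std::is_trivially_copyable<Point>::value,
              "Point may be copied with memcpy");
static_assert(!std::is_trivially_copyable<Named>::value,
              "a std::string member rules out memcpy");
static_assert(!std::is_trivially_copyable<DerivedNamed>::value,
              "so does a base class with a std::string member");

// Byte-wise copy that refuses to compile for unsafe types.
template <typename T>
void copy_bytes(T& dst, const T& src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy requires a trivially copyable type");
    std::memcpy(&dst, &src, sizeof(T));
}

// Copies a Point through copy_bytes and returns the result.
Point copy_point(const Point& src) {
    Point dst{0, 0};
    copy_bytes(dst, src);
    return dst;
}
```

With this guard, calling `copy_bytes` on two `Named` objects fails at compile time instead of silently producing two objects that share one string buffer.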

Upvotes: 2
