Reputation: 179
I read in many places including Effective C++ that it is better to store data on the stack and not as pointer to the data.
I can understand doing this with small object, because the number of new and delete calls is also reduced, which reduces the chance of a memory leak. Also, the pointer can take more space than the object itself.
But with large object, where copying them will be expensive, is it not better to store them in a smart pointer?
Because with many operations with the large object there will be few object copying which is very expensive (I am not including the getters and setters).
Upvotes: 4
Views: 143
Reputation:
Let's focus purely on efficiency. There's no one-size-fits-all, unfortunately. It depends on what you are optimizing for. There's a saying, always optimize the common case. But what is the common case? Sometimes the answer lies in understanding your software's design inside out. Sometimes it's unknowable even at the high level in advance because your users will discover new ways to use it that you didn't anticipate. Sometimes you will extend the design and reveal new common cases. So optimization, but especially micro-optimization, is almost always best applied in hindsight, based on both this user-end knowledge and with a profiler in your hand.
The few times you can usually have really good foresight about the common case is when your design is forcing it rather than responding to it. For example, if you are designing a class like std::deque
, then you're forcing the common case write usage to be push_fronts
and push_backs
rather than insertions to the middle, so the requirements give you decent foresight as to what to optimize. The common case is embedded into the design, and there's no way the design would ever want to be any different. For higher-level designs, you're usually not so lucky. And even in the cases where you know the broad common case in advance, knowing the micro-level instructions that cause slowdowns is too often incorrectly guessed, even by experts, without a profiler. So the first thing any developer should be interested in when thinking about efficiency is a profiler.
But here's some tips if you do run into a hotspot with a profiler.
Most of the time, the biggest micro-level hotspots if you have any will relate to memory access. So if you have a large object that is just one contiguous block where all the members are getting accessed in some tight loop, it'll aid performance.
For example, if you have an array of 4-component mathematical vectors you're sequentially accessing in a tight algorithm, you'll generally fare far, far better if they're contiguous like so:
x1,y1,z1,w1,x2,y2,x2,w2...xn,yn,zn,wn
... with a single-block structure like this (all in one contiguous block):
x
y
z
w
This is because the machine will fetch this data into a cache line which will have the adjacent vectors' data inside of it when it's all tightly packed and contiguous in memory like this.
You can very quickly slow down the algorithm if you used something like std::vector
here to represent each individual 4-component mathematical vector, where every single individual one stores the mathematical components in a potentially completely different place in memory. Now you could potentially have a cache miss with each vector. In addition, you're paying for additional members since it's a variable-sized container.
std::vector
is a "2-block" object that often looks like this when we use it for a mathematical 4-vector:
size
capacity
ptr --> [x y z w] another block
It also stores an allocator but I'll omit that for simplicity.
On the flip side, if you have a big "1-block" object where only some of its members get accessed in those tight, performance-critical loops, then it might be better to make it into a "2-block" structure. Say you have some Vertex
structure where the most-accessed part of it is the x/y/z position but it also has a less commonly-accessed list of adjacent vertices. In that case, it might be better to hoist that out and store that adjacency data elsewhere in memory, perhaps even completely outside of the Vertex
class itself (or merely a pointer), because your common case, performance-critical algorithms not accessing that data will then be able to access more contiguous vertices nearby in a single cache line since the vertices will be smaller and point to that rarely-accessed data elsewhere in memory.
When rapid creation and destruction of objects is a concern, you can also do better to create each object in a contiguous memory block. The fewer separate memory blocks per object, the faster it'll generally go (since whether or not this stuff is going on the heap or stack, there will be fewer blocks to allocate/deallocate).
So far I've been talking more about contiguity than stack vs. heap, and it's because stack vs. heap relates more to client-side usage of an object rather than an object's design. When you're designing the representation of an object, you don't know whether it's going on the stack or heap. What you do know is whether it's going to be fully contiguous (1 block) or not (multiple blocks).
But naturally if it's not contiguous, then at least part of it is going on the heap, and heap allocations and deallocations can be enormously expensive if you are relating the cost to the hardware stack. However, you can mitigate this overhead often with the use of efficient O(1) fixed allocators. They serve a more special purpose than malloc
or free
, but I would suggest concerning yourself less with the stack vs. heap distinction and more about the contiguity of an object's memory layout.
Last but not least, if you are copying/swapping/moving objects a lot, then the smaller they are, the cheaper this is going to be. So you might want to sort pointers or indices to big objects sometimes, for example, instead of the original object, since even a move constructor for a type T where sizeof(T) is a large number is going to be expensive to copy/move.
So move constructing something like the "2-block" std::vector
here which is not contiguous (its dynamic contents are contiguous, but that's a separate block) and stores its bulky data in a separate memory block is actually going to be cheaper than move constructing like a "1-block" 4x4 matrix that is contiguous. It's because there's no such thing as a cheap shallow copy
if the object is just one big memory block rather than a tiny one with a pointer to another. One of the funny trends that arises is that objects which are cheap to copy are expensive to move, and ones which are very expensive to copy are cheap to move.
However, I would not let copying/move overhead impact your object implementation choices, because the client can always add a level of indirection there if he needs for a particular use case that taxes copies and moves. When you're designing for memory layout-type micro-efficiency, the first thing to focus on is contiguity.
The rule for optimization is this: if you have no code or no tests or no profiling measurements, don't do it. As others have wisely suggested, your number one concern is always productivity (which includes maintainability, safety, clarity, etc). So instead of trapping yourself in hypothetical what-if scenarios, the first thing to do is to write the code, measure it twice, and change it if you really have to do so. It's better to focus on how to design your interfaces appropriately so that if you do have to change anything, it'll just affect one local source file.
Upvotes: 4
Reputation: 52622
The reality is that this is a micro-optimisation. You should write the code to make it readable, maintainable and robust. If you worry about speed, you use a profiling tool to measure the speed. You find things that take more time than they should, and then and only then do you worry about speed optimisation.
An object should obviously only exist once. If you make multiple copies of an object that is expensive to copy you are wasting time. You also have different copies of the same object, which is in itself not a good thing.
"Move semantics" avoids expensive copying in cases where you didn't really want to copy anything but just move an object from here to there. Google for it; it is quite an important thing to understand.
Upvotes: 4
Reputation: 385395
What you said is essentially correct. However, move semantics alleviate the concern about object copying in a large number of cases.
Upvotes: 0