Alasdair

Reputation: 14161

C++ variables and where they are stored in memory (stack, heap, static)

I've just begun learning C++ and I wanted to get my head around the different ways to create variables and what the different keywords mean. I couldn't find any description that really went through it, so I wrote this to try to understand what's going on. Have I missed anything? Am I wrong about anything?

Global Variables

Global variables are stored neither on the heap nor on the stack. static global variables are not exported: standard global variables can be accessed from other files with extern, static globals cannot.

Dynamic Variables

Any variable that is accessed with a pointer is stored on the heap. Heap variables are allocated with the new keyword, which returns a pointer to the memory address on the heap. The pointer itself is a standard stack variable.

Variables inside {} that aren't created with new

Stored in the stack, which is limited in size so it should be used only for primitives and small data structures. static keyword means the variable is essentially global and stored in the same memory space as global variables, but scope is restricted to this function/class. const keyword means you can't change the variable. thread_local is like static but each thread gets its own variable.

Register

A variable can be declared as register to hint to the compiler that it should be stored in a register. The compiler will probably ignore this and apply it to whatever it thinks would be the best improvement. Typical usage would be for an index or pointer being used as an iterator in a loop.

Good practice

  1. Use const by default when applicable, it's faster.
  2. Be wary of static and globals in multithreaded applications; use thread_local or a mutex instead.
  3. Use register on iterators.

Notes

Any variable created inside a function (non-global) that is not static or thread_local and is not created with new will be on the stack. Stack variables should not exceed a few KB in memory; otherwise use new to put them on the heap.

The full available system memory can be used for variables with static keyword, thread_local keyword, created with new, or global.

Variables created with new need to be freed with delete. All others are automatically freed when they go out of scope, except static, thread_local and globals, which are freed when the program ends.

Despite all the parroting about how globals should not be used, don't be bullied: they are great for some use cases, and more efficient than variables allocated on the heap. Mutexes will be needed to avoid race conditions in multi-threaded applications.

Upvotes: 1

Views: 3549

Answers (3)

t.niese

Reputation: 40882

You need to differentiate between specification and implementation. The specification does not say anything about stack and heap, because that's an implementation detail. It purposely talks about storage duration instead.

How a storage duration is implemented depends on the target environment, and on whether the compiler needs to allocate anything at all or whether the values can be determined at compile time and then exist only as part of the machine code (which, of course, also sits somewhere in memory).

So most of your descriptions should be read as "for the target platform XY, it will generally allocate on the stack/heap if I do XY".

C++ can also be used as an interpreted language (e.g. with cling), which could handle memory in completely different ways.

It could be cross-compiled for some kind of bytecode interpreter in which every object is dynamically allocated.

And when it comes to embedded systems, the way memory is managed might differ even more.

Heap variables are allocated with the new keyword, which returns a pointer to the memory address on the heap.

If the default operator new and operator new[] are mapped to something like malloc (or an equivalent on the given OS), this is likely the case (provided the object really needs to be allocated).

But on embedded systems, operator new and operator new[] might not be implemented at all. The "OS" might just give the application a chunk of memory that is handled like stack memory, from which you manually reserve a certain amount, and you implement an operator new and operator new[] that work with this preallocated memory. In such a case you only have stack memory.

Besides that, you can create a custom operator new for certain classes that allocates the memory on some hardware that is different to the "regular" memory provided by the OS.
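As a rough sketch of that idea, the class below supplies its own operator new that hands out memory from a fixed, statically allocated pool instead of the default heap. The names pool, pool_used, Sample and sample_in_pool are made up for this illustration, and the "allocator" is a deliberately naive bump allocator that never reuses memory; a real one would have to.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical pool standing in for "special" memory (assumption for this sketch).
alignas(std::max_align_t) unsigned char pool[1024];
std::size_t pool_used = 0;

struct Sample {
    int value;
    explicit Sample(int v) : value(v) {}

    // Class-specific operator new: used by `new Sample(...)` instead of the global one.
    static void* operator new(std::size_t bytes) {
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t rounded = (bytes + align - 1) / align * align;
        if (pool_used + rounded > sizeof pool) throw std::bad_alloc();
        void* p = pool + pool_used;
        pool_used += rounded;
        return p;
    }

    // Naive on purpose: memory is never handed back to the pool.
    static void operator delete(void*) noexcept {}
};

// Returns true if the object created with `new` really lives inside the pool.
bool sample_in_pool() {
    Sample* s = new Sample(5);
    unsigned char* p = reinterpret_cast<unsigned char*>(s);
    const bool inside = p >= pool && p < pool + sizeof pool;
    const bool ok = inside && s->value == 5;
    delete s;  // calls Sample::operator delete, a no-op here
    return ok;
}
```

Calling sample_in_pool() shows that new Sample(5) never touched the general-purpose heap, even though the calling code looks exactly like an ordinary heap allocation.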

The std::vector is allocating the memory in the same memory space that new is allocating it, i.e. the heap, or is it not? This is important because it changes how I use it.

A std::vector is defined as template<class T, class Allocator = std::allocator<T>> class vector; so there is a default behavior (given by the implementation) for where the vector allocates memory. On common desktop OSes it uses something like malloc to allocate memory dynamically. But you could also provide a custom allocator that uses memory at any other addressable memory location (e.g. the stack).
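To make the allocator point concrete, here is a minimal sketch of a vector whose elements end up in a local buffer rather than on the heap. Arena, ArenaAllocator and vector_lives_in_buffer are hypothetical names invented for this example, and the allocator is a simple bump allocator that never frees; treat it as an illustration of the mechanism, not a production allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical fixed-size arena over a caller-provided buffer.
struct Arena {
    unsigned char* base;
    std::size_t size;
    std::size_t used;
};

// Minimal allocator that carves memory out of an Arena.
template <class T>
struct ArenaAllocator {
    using value_type = T;
    Arena* arena;

    explicit ArenaAllocator(Arena* a) noexcept : arena(a) {}

    template <class U>
    ArenaAllocator(const ArenaAllocator<U>& other) noexcept : arena(other.arena) {}

    T* allocate(std::size_t n) {
        // Round the current offset up to T's alignment (the buffer itself
        // is assumed to be maximally aligned).
        const std::size_t align = alignof(T);
        const std::size_t start = (arena->used + align - 1) / align * align;
        const std::size_t bytes = n * sizeof(T);
        if (start + bytes > arena->size) throw std::bad_alloc();
        arena->used = start + bytes;
        return reinterpret_cast<T*>(arena->base + start);
    }

    // Bump allocation: individual blocks are never returned.
    void deallocate(T*, std::size_t) noexcept {}
};

template <class T, class U>
bool operator==(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) noexcept {
    return a.arena == b.arena;
}

template <class T, class U>
bool operator!=(const ArenaAllocator<T>& a, const ArenaAllocator<U>& b) noexcept {
    return !(a == b);
}

// Returns true if the vector's element storage lies inside the local buffer.
bool vector_lives_in_buffer() {
    alignas(std::max_align_t) unsigned char buffer[256];  // automatic storage
    Arena arena{buffer, sizeof buffer, 0};
    ArenaAllocator<int> alloc(&arena);

    std::vector<int, ArenaAllocator<int>> v(alloc);
    v.reserve(8);
    v.push_back(7);
    assert(v[0] == 7);

    unsigned char* p = reinterpret_cast<unsigned char*>(v.data());
    return p >= buffer && p < buffer + sizeof buffer;
}
```

With the default std::allocator the same vector code would typically end up calling the global operator new, so where the elements live is a property of the allocator, not of std::vector itself.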

Upvotes: 1

Yksisarvinen

Reputation: 22394

The other answer is correct, but doesn't mention the use of register.

The compiler will probably ignore this and apply it to whatever it thinks would be the best improvement.

This is correct. Compilers are so good at choosing which variables to put in registers (and the typical programmer is so bad at it) that the C++ committee decided the keyword is completely useless.

This keyword was deprecated in C++11 and removed in C++17 (but it's still reserved for possible future use).

Do not use it.

Upvotes: 2

al45tair

Reputation: 4433

Mostly right.

Any variable that is accessed with a pointer is stored on the heap.

This isn't true. You can have pointers to stack-based or global variables.
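A small sketch to illustrate: the function increment below receives a pointer and has no way of knowing where the pointed-to int lives. The names global_counter, increment and pointer_demo are made up for this example.

```cpp
#include <cassert>

int global_counter = 0;  // static storage duration: neither stack nor heap

// Writes through the pointer; it cannot tell where the int is stored.
void increment(int* p) { ++*p; }

// Increments a local and a global through pointers and returns their sum.
int pointer_demo() {
    int local = 10;                    // automatic storage (typically the stack)
    int* to_local = &local;            // pointer to a stack variable
    int* to_global = &global_counter;  // pointer to a global variable
    increment(to_local);
    increment(to_global);
    assert(local == 11);
    return local + global_counter;
}
```

No new appears anywhere, so nothing here is on the heap, yet every access in increment goes through a pointer.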

Also it's worth pointing out that global variables are generally unified by the linker (i.e. if two modules have "int i" at global scope, you'll only have one global variable called "i"). Dynamic libraries complicate that slightly; on Windows, DLLs don't have that behaviour (i.e. an "int i" in a Windows DLL will not be the same "int i" as in another DLL in the same process, or as in the main executable), while on most other platforms dynamic libraries do. There are some additional complications on Darwin (iOS/macOS), which has a hierarchical namespace for symbols; as long as you're linking with the flat_namespace option, what I just said will hold.

Additionally, it's worth talking about initialisation behaviour; global variables are initialised automatically by the runtime (typically either using special linker features or by means of a call that is inserted into the code for your main function). The order of initialisation of globals defined in different translation units isn't guaranteed. However, static variables declared at function scope are initialised when that function is first executed, and not at program start-up as you might suppose, and that feature is commonly used by C++ programmers to do lazy initialisation.
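The lazy initialisation idiom can be sketched like this. The names init_calls, expensive_init and lazy_value are invented for the example; the counter exists only to show when the initialiser actually runs.

```cpp
#include <cassert>

int init_calls = 0;  // counts how often the initialiser runs

// Stand-in for some costly set-up work.
int expensive_init() {
    ++init_calls;
    return 42;
}

// The static local is initialised the first time control reaches its
// declaration, not at program start-up, and only once.
int& lazy_value() {
    static int value = expensive_init();
    return value;
}
```

Before the first call to lazy_value(), init_calls is still 0; after any number of calls it is 1. Since C++11 this first-use initialisation is also guaranteed to be thread-safe.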

(Similar concerns apply to destructors for global objects; those are best avoided entirely IMO, not least because on some platforms there are fast termination features that simply won't call them.)

const keyword means you can't change the variable.

Almost. const affects the type, and there is a difference depending on where you write it exactly. For example

const char *foo;

should be read as foo is a pointer to a const char, i.e. foo itself is not const, but the thing it points at is. Contrast with

char * const foo;

which says that foo is a const pointer to char.
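Here is a short sketch of the difference in practice. The names buffer_a, buffer_b and const_demo are made up for the example, and the lines that would not compile are left as comments. Note that a const pointer has to be initialised where it is declared, since it cannot be assigned later.

```cpp
#include <cassert>

char buffer_a[] = "abc";
char buffer_b[] = "xyz";

// Returns the character p points at after being reseated.
char const_demo() {
    const char* p = buffer_a;  // pointer to const char
    p = buffer_b;              // OK: the pointer itself is not const
    // *p = 'q';               // error: cannot modify a const char through p

    char* const q = buffer_a;  // const pointer to char, must be initialised here
    *q = 'Q';                  // OK: the char it points at is not const
    // q = buffer_b;           // error: q itself cannot be changed

    assert(buffer_a[0] == 'Q');
    return *p;
}
```

After the call, buffer_a starts with 'Q' (written through q) and the function returns 'x', the first character of buffer_b.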

Finally, you've missed out volatile, the point of which is to tell the compiler not to make assumptions about the thing to which it applies (e.g. it can't assume that it's safe to cache a volatile value in a register, or to optimise away accesses, or in general to optimise across any operation that affects a volatile value). Hopefully you'll never need to use volatile; it's most often useful if you're doing really low-level things that frankly a lot of people have no need to go anywhere near.

Upvotes: 2
