Reputation: 275
Suppose you have the following code:
const size_t size = 5;
int array[size]{1,2,3,4,5}; // ok to initialize since size is const
size_t another_size = 5;
int another_array[another_size]; // can't do int another_array[another_size]{1,2,3,4,5};
another_array[0] = 1;
another_array[1] = 9090;
another_array[2] = 76;
another_array[3] = 90;
another_array[4] = 100;
Since array
is created with a const size, it is able to be initialized. another_array
, however, cannot be initialized because it does not have a const size.
If I am able to assign values to another_array
after declaring the array, why am I not able to initialize another_array
in the first place? Shouldn't the compiler know the size? What is created by array
and another_array
when the code is run? I would assume the fact that the compiler lets you create another_array
with a non-const
size means the compiler does know size?
Upvotes: 1
Views: 446
Reputation: 1159
The comments section addressed how to get variable-length arrays by using std::vector
. I want to take a closer look at exactly what is happening and why.
To answer your question(s), yes, the compiler does know the value of another_size
. Let's start with the most basic concepts and build out from there. For starters, consider the following code:
#include <iostream>
int main()
{
std::size_t n = 5;
int array[n] { 1, 2, 3, 4, 5 };
for (auto i = 0; i < 5; ++i) {
std::cout << array[i] << ' ';
}
}
On gcc 7.3, this produces the following output:
[-std=c++17 -Wall -Wextra -Weffc++ -pedantic -O3
]
<source>: In function 'int main()':
<source>:9:16: ISO C++ forbids variable length array 'array' [-Wvla]
int array[n] { 1, 2, 3, 4, 5 };
^
Compiler returned: 0
If you'll notice, the diagnostic from the compiler says nothing about not recognizing the n
identifier (the counterpart of your another_size), or about it holding a nonsensical value because it was uninitialized or initialized badly.
The diagnostic simply says:
ISO C++ forbids variable length array 'array' [-Wvla]
Oddly enough, that's exactly what it means. The problem is not that the compiler thinks you're missing an expression for the array's size, because when your program was compiled, the lexer tokenized the file and the parser generated a tree representing the semantics deduced from your code's syntax. You'd be surprised how much the compiler can deduce from your code, and it's well aware of the identifier another_size
, as well as the associated value (5). However, the C++ standard explicitly disallows variable-length arrays, and for good reason, as we'll see soon. The actual restriction though, is one that could be considered an "artificial" one since it does not actually stem from a technological limitation on the compiler's ability to deduce your intent.
In addition to all of the above, a lot of times you don't really know how much stack space you have available, so allocating an array of size n
is playing Russian roulette with memory bugs that will be extremely difficult to find.
As a corollary to my previous point, if you are actually keeping track of how much stack space you have, I dare say you're not programming at the right level of abstraction.
If this restriction is imposed by the standard rather than by a technological limitation, the logical follow-up question is "why?"
Well, first of all, we have to address the principal problem with allowing variable-length arrays: it's not primarily about the developer hard-coding non-const values in the source. (That is also bad practice, though; see: What is a magic number and why is it bad and const-correctness) The problem really revolves around the fact that if you can set the size of a stack-allocated array based on a non-const value, then surely, by Murphy's Law, the Law of Large Numbers, etc., some poor, hapless, unsuspecting but well-meaning junior developer will allow a user to enter the size of the array themselves, and we're off to the races. Conversely, requiring array sizes to be constant expressions, such as integer literals or const variables initialized from them, disallows this.
Interestingly, variable-length arrays are actually legal in other languages, most notably in C as of the C99 standard. Even there though, they are discouraged. The biggest problem with variable length arrays is that they are stack-allocated, and while stack-allocation is normally considered a good thing, in this case it represents a liability.
Stack-smashing has been mitigated as a vulnerability thanks to things like address space layout randomization and an increased awareness of the risks involved, but it's far from a solved problem. As it relates to this specific case, the accepted practice when receiving input from the user is to restrict the number of bytes written into the passed in buffer. One of the advantages we as developers have in this case is the knowledge of how big this buffer actually is. The last thing we want is to give a potential intruder the ability to set the size of a stack-allocated array themselves.
What's more, getting user input is extremely risky and a lot of care needs to be taken to properly sanitize and contain the input. Having a variable-length array that requires a runtime value to be entered to set its size is just one more opportunity for something to go wrong.
To see what the compiler actually does with such an array, consider the following code:
#include <iostream>
int main()
{
std::size_t n = 5;
int array[n] { 1, 7, 5, 0, 1 };
for (auto i = 0; i < 5; ++i) {
std::cout << array[i] << ' ';
}
}
As you can see, we've used a non-const value as the size of a stack-allocated array and initialized it in the exact manner that was giving you the error. My compiler is also warning me about the array, but I've compiled only with -std=c++17 -pedantic -O3
, so compilation continues in spite of this warning, producing the following code, abridged for clarity and brevity:
main:
push rbp
push rbx
sub rsp, 56
movdqa xmm0, XMMWORD PTR .LC0[rip]
lea rbx, [rsp+16]
lea rbp, [rsp+36]
mov DWORD PTR [rsp+32], 1
movaps XMMWORD PTR [rsp+16], xmm0
.L2:
mov esi, DWORD PTR [rbx]
mov edi, OFFSET FLAT:std::cout
add rbx, 4
call std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
lea rsi, [rsp+15]
mov edx, 1
mov rdi, rax
mov BYTE PTR [rsp+15], 32
call std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)
cmp rbx, rbp
jne .L2
add rsp, 56
xor eax, eax
pop rbx
pop rbp
ret
_GLOBAL__sub_I_main:
sub rsp, 8
mov edi, OFFSET FLAT:std::__ioinit
call std::ios_base::Init::Init()
mov edx, OFFSET FLAT:__dso_handle
mov esi, OFFSET FLAT:std::__ioinit
mov edi, OFFSET FLAT:std::ios_base::Init::~Init()
add rsp, 8
jmp __cxa_atexit
.LC0:
.long 1
.long 7
.long 5
.long 0
I encourage you to try this on your own, producing your own assembly code (use -S
for assembly and -masm=intel
, default is at&t syntax). While I won't include the version of this code using the const
modifier on n
, the code is exactly the same. Not basically exactly the same, literally exactly the same, at least on gcc with these options.
Also, I want to clarify that if you were to compile this code with optimizations disabled, you might get more intuitive results, in the sense that there might be more of a one-to-one correspondence between the code you write and the assembly instructions the compiler outputs. That being said, I think analyzing a fully optimized program, even if it's only a toy example, is much more useful, since it'll help you get a feel for what optimizations the compiler uses, especially since x86-64 differs from x86 in some non-trivial ways. Also, some assembly instructions implicitly reference specific registers, which can be confusing if you're not expecting it.
So what does this code mean though? Let's break it down.
Upon entering main
, the rbp
and rbx
registers are pushed onto the stack. Recall that in x86-64 rbp
can be used as a general-purpose register and does not have to act as the base pointer. Instead, the processor uses rsp
to support function calls and returns.
Having freed up the rbp
and rbx
registers, we now proceed to actually allocating the stack. As we mentioned in the beginning, the compiler knows exactly what you meant when you assigned a non-const value as the size of the another_array
array. Dutifully, the compiler reserves the necessary stack space for main
with the sub rsp, 56
instruction.
Remember that rsp
holds a memory address, so when we subtract 56 from rsp
, we are moving it down by 56. Since the stack grows down, this reserves 56 bytes of stack space, which on a 64-bit architecture is seven 8-byte quadwords.
After allocating the stack memory, we see this line:
movdqa xmm0, XMMWORD PTR .LC0[rip]
The movdqa
instruction means Move Aligned Double Quadword, fancy speak for move 128 bits from somewhere to the xmm0
register. There are a few things to point out here. First of all, the movdqa
instruction moves data between xmm
registers, or between an xmm register and a 128-bit memory operand. As you can see, the source here is a memory operand that is being "cast", if you will, at the .LC0
address. The XMMWORD PTR qualifier is necessary because an address by itself carries no size information, and the instruction needs to know that it is reading 128 bits from memory. Also, notice how I used "cast" in quotation marks? That's because casting in assembly language is about size, not type itself. There is no type checking in vanilla assembly language; it's an abstraction provided by the programming language you're working in. In fact, the number of parameters you pass into a function is also not compared with the declared arity of the function. This is another safeguard provided by your language's compiler. The code you write will just execute, and probably cause a segmentation fault if you messed something up.
Historical note: Back in the old days, this was a huge deal because you were afforded no memory protection by the OS or processor. If you wrote a program that accidentally allocated or wrote to too much memory, it was very possible to overwrite not just your personal stuff like documents and programs, but your kernel as well. We have the luxury of protected mode and virtual memory nowadays, but interestingly, x86 computers still start in real mode and then switch into protected mode.
Going back to the movdqa
instruction, it's interesting that the compiler chose to use an xmm
register for this program. As you can see from our C++ code, our array only holds integers, so why use a register normally associated with floating-point and vector work? The compiler took advantage of packing, where it stuffed four of our numbers into a single register. If you'll notice as well, under the .LC0
label, there are only four elements defined, even though we have five integers declared in our program. The fifth element (the trailing 1) is stored separately as an immediate by mov DWORD PTR [rsp+32], 1
, and each of the remaining four values is emitted with the .long
directive.
This is perfect because the xmm
registers in x64 are 128 bits. Note that .long
here is an assembler directive, not the C++ long
type: on x86, the GNU assembler emits a 32-bit value for each one. These four 32-bit values are now packed into a single 128-bit register.
Going back to our analysis, the next two instructions are pretty straightforward:
lea rbx, [rsp+16]
lea rbp, [rsp+36]
The lea
instruction loads an effective address, in this case [rsp+16]
. This is useful because we're passing addresses relative to the stack pointer.
Now, it might not be immediately obvious, but [rsp+16]
is the address of the first element of the array and [rsp+36]
is the address just past the last one (16 + 5 * 4 = 36). In .L2
you can see the instruction cmp rbx, rbp
. Each pass through the loop loads the integer that rbx
points to, advances rbx
by 4 bytes (thus making it point to the next value in this array of integers), prints the value, and then compares rbx
with rbp
. If the two addresses are not equal, jne moves the instruction pointer back to the start of .L2
and the loop repeats.
This isn't specific to your question regarding the array, so I'll fast forward, but I do want to hit two points real quick:
First, notice that if cmp rbx, rbp
finds the two addresses equal, we fall through instead of jumping back to .L2
. We then deallocate the stack memory we previously allocated by adding 56 to rsp
.
Second, notice this last instruction: xor eax, eax
. In x86, the calling convention is to place the result of the function into eax
. Since main
returns 0 by default upon successful execution, the compiler needs to zero that register, and an exclusive-or of a register with itself always yields zero. We then pop rbx
and rbp
from the stack and return.
To summarize, VLAs afford you no real additional benefit, make code less intuitive to the reader, and can open up likely (and costly) attack vectors. Using them is possible, though, as the limitation is established by the standard and not by the technology.
Upvotes: 2