Ascelhem

Reputation: 423

Interesting program behaviour

So I have a little program

#include <iostream>
using namespace std;

void lol() {
    cout << "How did we get here?"<<std::endl;
}

int main()
{
   long a, b, z[10];
   cin >> a >> b;
   z[a] = b;
}

You can run it via online compiler here

The program has no purpose, but it has one bug or feature; I do not know which. If you run it with something like main 13 2015, you'll probably get nothing, but if you enter the two magic numbers 13 and 4196608, you'll get an error. Moreover, the program executes the function void lol() and prints the line How did we get here?.

I ran nm ./main and found my function void lol() at address 0000000000400900 (hexadecimal), which is 4196608 in decimal.
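As a quick sanity check of that conversion, here is a minimal sketch that compares the two spellings at compile time. The constant names are made up for illustration; only the two numeric values come from the post.

```cpp
#include <cstdint>

// Address of lol() as reported by nm in the post (hexadecimal).
constexpr std::uint64_t lol_address_hex = 0x400900;

// The second "magic number" typed into the program (decimal).
constexpr std::uint64_t magic_input = 4196608;

// Compile-time check that both spellings denote the same value:
// 0x400900 = 4 * 16^5 + 9 * 16^2 = 4194304 + 2304 = 4196608.
static_assert(lol_address_hex == magic_input,
              "0x400900 and 4196608 should be the same number");
```

If the two values differed, the file would simply fail to compile.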

That means that the program "jumps" for some reason to this address and executes the function void lol(). Moreover, if I change the first number, nothing happens: main 10 4196608, main 11 4196608, main 12 4196608, main 14 4196608, main 15 4196608 all behave the same, with no errors. But as soon as I enter the number 13, I get this interesting behaviour.

Can anyone explain what's going on here?

Upvotes: 0

Views: 98

Answers (2)

If the input for a is above 9 (or negative), z[a] is an out-of-bounds access (a buffer overflow), since you declared the array as long z[10].

This is typical undefined behavior (UB).
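To stay out of UB territory, the index has to be validated before the store. A minimal sketch of one way to do that follows; in_bounds and checked_store are hypothetical helper names, not something from the question, and switching to std::array is my own assumption.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: true only when `index` is a valid position
// in an array holding `size` elements.
constexpr bool in_bounds(long index, std::size_t size) {
    return index >= 0 && static_cast<std::size_t>(index) < size;
}

// Hypothetical helper: writes `value` to z[index] only when the index is valid.
// Returns false and leaves z untouched otherwise.
inline bool checked_store(std::array<long, 10>& z, long index, long value) {
    if (!in_bounds(index, z.size())) {
        return false;
    }
    z[static_cast<std::size_t>(index)] = value;
    return true;
}
```

With this, checked_store(z, 13, 4196608) returns false instead of writing past the end of the array. Another option is z.at(index), which throws std::out_of_range for an index that is too large (a negative long would first have to be rejected, because it converts to a huge unsigned value).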

UB is very bad, see this answer of mine, or for more background:

The only way to explain some actual undefined behavior is to dive into all the implementation-specific details (compiler, optimization, operating system, machine code, processor, etc.). You could spend years on this. Perhaps in your case the return address on the call stack has been overwritten by the address of the lol function.

Upvotes: 7

André Puel

Reputation: 9179

Using the information that Basile Starynkevitch gave us, I ran some experiments whose results strongly suggest that you are messing with the return address.

I created an intermediate function main2() which returns to main(), so we know what to expect in the stack slot that holds the function's return address. My code prints the previous value of z[a], and I compare it with the address of the caller, i.e. the main() function:

#include <iostream>
#include <string>
using namespace std;

void lol() {
    cout << "How did we get here?"<<std::endl;
}

int main(int argc, char** argv);

void main2()
{
   long a, b, z[10];
   b = reinterpret_cast<long>(&lol);
   a = 15; //The offset depends on your machine, I found out 15 by trial and error
   std::cout << "z[a] was " << z[a] << std::endl;
   std::cout << "main() was " << reinterpret_cast<long>(&main) << std::endl;
   z[a] = b;
}

int main(int argc, char** argv) {
    main2();
} 

The output is:

z[a] was 4196677
main() was 4196657
How did we get here?
Segmentation fault (core dumped)

I don't know the size of each instruction once compiled for 64-bit x86, but in the assembly of main(), the call instruction comes right after a short prologue:

main:
.LFB1022:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $16, %rsp
    movl    %edi, -4(%rbp)
    movq    %rsi, -16(%rbp)
    call    _Z5main2v

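which could explain the 20-byte offset between the address of main() and the value we originally had in z[a]: a return address points to the instruction right after the call, not to the start of the calling function.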
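The arithmetic can be checked with a small sketch. The two addresses are the values printed by the experiment above; the per-instruction byte counts are my assumption, based on the usual x86-64 encodings of these instructions, and were not measured on this binary (objdump -d ./main would show the real ones).

```cpp
#include <cstdint>

// Values printed by the experiment.
constexpr std::int64_t saved_return_address = 4196677;  // "z[a] was"
constexpr std::int64_t main_address         = 4196657;  // "main() was"

// ASSUMED encoded sizes in bytes (typical x86-64 encodings, not measured).
constexpr int size_push_rbp    = 1;  // pushq %rbp
constexpr int size_mov_rsp_rbp = 3;  // movq  %rsp, %rbp
constexpr int size_sub_rsp     = 4;  // subq  $16, %rsp
constexpr int size_movl_edi    = 3;  // movl  %edi, -4(%rbp)
constexpr int size_movq_rsi    = 4;  // movq  %rsi, -16(%rbp)
constexpr int size_call        = 5;  // call  _Z5main2v (rel32 form)

// Bytes from the start of main() up to and including the call instruction.
constexpr int bytes_through_call = size_push_rbp + size_mov_rsp_rbp +
                                   size_sub_rsp + size_movl_edi +
                                   size_movq_rsi + size_call;

// Distance between the saved return address and the start of main().
constexpr std::int64_t observed_offset = saved_return_address - main_address;

static_assert(observed_offset == 20, "offset seen in the program output");
static_assert(bytes_through_call == observed_offset,
              "assumed instruction sizes add up to the observed offset");
```

Under those assumed sizes, the slot at z[15] held the address of the instruction immediately after the call to main2(), which is what you would expect from a saved return address.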

Upvotes: 1
