user3482098

memcpy behaves differently with optimization flags compared to without

Consider this demo programme:

#include <string.h>
#include <unistd.h>

typedef struct {
    int a;
    int b;
    int c;
} mystruct;

int main() {
    int TOO_BIG = getpagesize();
    int SIZE = sizeof(mystruct);
    mystruct foo = {
        123, 323, 232
    };

    mystruct bar;
    memset(&bar, 0, SIZE);
    memcpy(&bar, &foo, TOO_BIG);
}

I compile this two ways:

  1. gcc -O2 -o buffer -Wall buffer.c
  2. gcc -g -o buffer_debug -Wall buffer.c

i.e. the first time with optimizations enabled, the second time with debug flags and no optimization.

The first thing to notice is that there are no warnings when compiling, despite getpagesize returning a value that will cause buffer overflow with memcpy.

Secondly, running the first programme produces:

*** buffer overflow detected ***: terminated
Aborted (core dumped)

whereas the second produces

*** stack smashing detected ***: terminated
Aborted (core dumped)

or, and you'll have to believe me here since I can't reproduce this with the demo programme, sometimes no error at all. The programme isn't even interrupted; it runs as normal. This was a behaviour I encountered with some more complex code, which made it difficult to debug until I realised that there was a buffer overflow happening.

My question is: why are there two different behaviours with different build flags? And why does this sometimes execute with no errors when built as a debug build, but always errors when built with optimizations?

Upvotes: 0

Views: 585

Answers (3)

anastaciu

Reputation: 23822

"... I can't reproduce this with the demo program, sometimes no warning at all ..."

The rules on undefined behavior are very broad: there is no requirement for the compiler to issue any warnings for a program that exhibits it.

"why are there two different behaviours with different build flags? And why does this sometimes execute with no errors when built as a debug build, but always errors when built with optimizations?"

Compilers tend to optimize away unused variables. If I compile your code with optimizations enabled, I don't get a segmentation fault: looking at the generated assembly, you'll notice that the problematic variables are optimized away and memcpy is never called, so nothing is left to fail at run time and the program exits with success code 0. If I don't optimize it, the undefined behavior manifests itself and the program exits with code 139, the classic segmentation fault exit code.

As you can see, these results are different from yours, and that is one of the features of undefined behavior: different compilers, systems or even compiler versions can behave in completely different ways.

Upvotes: 1

Lundin

Reputation: 214310

"The first thing to notice is that there are no warnings when compiling, despite getpagesize returning a value that will cause buffer overflow with memcpy."

That is the programmer's responsibility to fix, not the compiler's. You'll be very lucky if a compiler manages to find potential buffer overflows for you. Its job is to check that your code is valid C and then translate it to machine code.

If you want a tool that catches bugs, that is called a static analyser, which is a different type of program. To some extent, static analysis might be integrated in a compiler as a feature. There is one for clang, but most static analysers are commercial tools and not open source.

"Secondly, running the first programme produces: ... whereas the second produces"

Undefined behavior simply means there is no defined behavior (see "What is undefined behavior and how does it work?"). That means there's not likely anything to learn from examining the results, no interesting mystery to solve. In one case it apparently accessed forbidden memory, in the other case it mangled a poor little "stack canary". The difference will be related to different memory layouts. Who cares, bugs are bugs. Focus on why the bug happened (you already know!) instead of trying to make sense of the undefined results.

Now when I compile your code with optimizations actually enabled (gcc -O2 on an x86 Linux), the compiler gives me

main:
        subq    $8, %rsp
        call    getpagesize
        xorl    %eax, %eax
        addq    $8, %rsp
        ret

With optimizations actually enabled, it didn't even bother calling memcpy & friends because there are no side effects and the variables aren't used, so they can be safely removed from the executable.

Upvotes: 1

Sam Henke

Reputation: 399

Accessing memory beyond what's been allocated is undefined behavior, which means the compiler is allowed to do anything. When there are no optimizations, the compiler may try to guess and do something reasonable. When optimizations are turned on, the compiler may take advantage of the fact that any behavior is allowed and do something that runs faster.

Upvotes: 1
