ruslanbyku

clang AddressSanitizer instruments code improperly, false-positive result

FOREWORD

This question is pretty damn huge and related to my master's thesis, so I humbly ask for your patience. I ran into the problem described below about half a year ago. It needed an outside look, because at that point I was really stuck and had nobody to help me. In the end I gave up on it, but now I am back in business (a second wind, let us put it that way).

INTRODUCTION

Crucial technologies used in the project: C++, llvm/clang 13.0.1, ASAN, libFuzzer

The underlying idea behind the project I was writing is:

  1. Write a parser of C-code projects to find functions that are presumed to be vulnerable (within the scope of this question it does not matter how I decide that they are vulnerable).
  2. When I find a vulnerable function, I generate fuzzer code for it with libFuzzer.
  3. At this point I have an IR file with my vulnerable function and an IR file with my fuzzer code, so it is time to compile the two files separately. During compilation I have clang instrument them with ASAN and libFuzzer.
  4. The two files are then linked together and I have an executable called, for example, 'fuzzer'. Theoretically, I can run this executable and libFuzzer is going to fuzz my vulnerable function.

ACTUAL PROBLEM (PART 1)

ASAN somehow instruments my code incorrectly and gives me the wrong result. How do I know that? I found and took a vulnerable function. This function is from an old version of libcurl and is called sanitize_cookie_path. I reproduced the bug with AFL++ and it gave me what I wanted: if you pass a lone double quote (") to the function, it is going to 'blow'. I wanted to do something similar with libFuzzer and ASAN, but as I mentioned earlier these two did not give me the expected result. Having spent some time on the problem, I can say that something is wrong with ASAN.

PROBLEM REPRODUCTION

  1. I have the code (see below) in the file sanitize_cookie_path.c:

     #include <stdio.h>
     #include <string.h>
     #include <stdlib.h>
     #include <stdbool.h>
     #include <stddef.h>
    
     static char* sanitize_cookie_path(const char* cookie_path) {
         size_t len;
         char* new_path = strdup(cookie_path);
         if (!new_path) {
             return NULL;
         }
    
         if (new_path[0] == '\"') {
             memmove((void *)new_path, (const void*)(new_path + 1), strlen(new_path));
         }
         if (new_path[strlen(new_path) - 1] == '\"') {
             new_path[strlen(new_path) - 1] = 0x0;
         }
    
         if (new_path[0] !='/') {
             free(new_path);
             new_path = strdup("/");
             return new_path;
         }
    
         len = strlen(new_path);
         if (1 < len && new_path[len - 1] == '/') {
             new_path[len - 1] = 0x0;
         }
    
         return new_path;
     }
    
     int main(int argc, char** argv) {
         if (argc != 2) {
             exit(1);
         }
    
         sanitize_cookie_path("\"");
    
         return 0;
     }
    
  2. My C++ code compiles it with the command:

    clang -O0 -emit-llvm path/to/sanitize_cookie_path.c -S -o path/to/sanitize_cookie_path.ll > /dev/null 2>&1
    
  3. At the IR level of the above code I get rid of 'main', so only the 'sanitize_cookie_path' function remains.

  4. I generate the simple fuzzer code (see below) for this function:

    #include <cstdio>
    #include <cstdint>
    
    static char* sanitize_cookie_path(const char* cookie_path) ;
    
    extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
        (void) sanitize_cookie_path((char*) data);
    
        return 0;
    }
    
  5. Then I compile it with the command:

    clang -O0 -emit-llvm path/to/fuzz_sanitize_cookie_path.cc -S -o path/to/fuzz_sanitize_cookie_path.ll > /dev/null 2>&1
    
  6. The two IR files are compiled separately. NOTE that before this step I do some work to make them fit each other. For instance, I drop the 'static' keyword and resolve the name mangling between the C++ and the C code.

  7. I compile them both together with the command:

    clang++ -O0 -g -fno-omit-frame-pointer -fsanitize=address,fuzzer -fsanitize-coverage=trace-cmp,trace-gep,trace-div path/to/sanitize_cookie_path.ll path/to/fuzz_sanitize_cookie_path.ll -o path/to/fuzzer > /dev/null 2>&1
    
  8. The final 'fuzzer' executable is ready.

ACTUAL PROBLEM (PART 2)

If you execute the fuzzer program, it does not give you the same results as AFL++. My fuzzer crashes in the '__interceptor_strdup' function from some standard library (see the error snippet below). The crash report written by libFuzzer is literally empty (0 bytes), but ideally it should have shown that the error is triggered by a quote ("). Having done my own research, I concluded that ASAN instrumented the code incorrectly and gives me a false-positive result. Frankly speaking, I can fuzz the 'printf' function from stdio.h and get the same error.

[sanitize_cookie_path]$ ./fuzzer
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 1016408680
INFO: Loaded 1 modules   (11 inline 8-bit counters): 11 [0x5626d4c64c40, 0x5626d4c64c4b),
INFO: Loaded 1 PC tables (11 PCs): 11 [0x5626d4c64c50,0x5626d4c64d00),
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
=================================================================
==2804==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000011 at pc 0x5626d4ba7671 bp 0x7ffe43152df0 sp 0x7ffe431525a0
READ of size 2 at 0x602000000011 thread T0
    #0 0x5626d4ba7670 in __interceptor_strdup (/path/to/fuzzer+0xdd670)
    #1 0x5626d4c20127 in sanitize_cookie_path (/path/to/fuzzer+0x156127)
    #2 0x5626d4c20490 in LLVMFuzzerTestOneInput (/path/to/fuzzer+0x156490)
    #3 0x5626d4b18940 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/path/to/fuzzer+0x4e940)
    #4 0x5626d4b1bae6 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) (/path/to/fuzzer+0x51ae6)
    #5 0x5626d4b1c052 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) (/path/to/fuzzer+0x52052)
    #6 0x5626d4b0100b in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/path/to/fuzzer+0x3700b)
    #7 0x5626d4af0297 in main (/path/to/fuzzer+0x26297)
    #8 0x7f8e6442928f  (/usr/lib/libc.so.6+0x2928f)
    #9 0x7f8e64429349 in __libc_start_main (/usr/lib/libc.so.6+0x29349)
    #10 0x5626d4af02e4 in _start /build/glibc/src/glibc/csu/../sysdeps/x86_64/start.S:115

I used gdb to step into strdup(cookie_path). gdb shows me that the fuzzer crashes at the address 0x0000555555631687.

0x0000555555631684 <+452>:  mov    %rbp,%rsi
0x0000555555631687 <+455>:  addr32 call 0x555555674100 <_ZN6__asan18ReportGenericErrorEmmmmbmjb>
0x000055555563168d <+461>:  pop    %rax

WHAT I TRIED TO DO

  1. I tried to instrument my sanitize_cookie_path.c and fuzz_sanitize_cookie_path.cc with ASAN right at the beginning, not at the IR level, but whatever I did, nothing worked.

  2. I passed the 'fuzzer' a so-called corpus directory with pre-cooked input data. I even passed the quote explicitly to the 'fuzzer', but it made no difference. Example (run from the same directory as the fuzzer):

    $ mkdir corpus/; echo "\"" > corpus/input; hexdump corpus/input
    0000000 0a22                                   
    0000002
    $ ./fuzzer corpus/
    
  3. I also googled everything I could about libFuzzer and ASAN, but nothing helped.

  4. I changed the compilation command: I got rid of '-fno-omit-frame-pointer' and '-fsanitize-coverage=trace-cmp,trace-gep,trace-div'.
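
A side note on the seed in item 2: the hexdump output 0a22 stands for the two bytes 0x22 0x0a, that is, the quote followed by the newline that echo appends, so the seed is not a lone quote. Below is a minimal sketch for writing a seed file byte for byte. The helper name write_seed is an assumption made up for this illustration; it is not part of libFuzzer or of the project described here.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: write `bytes` to `path` exactly as given
// (binary mode, nothing appended). Returns true on success.
bool write_seed(const std::string& path, const std::string& bytes) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) {
        return false;  // e.g. the directory does not exist
    }
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    out.flush();
    return static_cast<bool>(out);
}
```

Calling write_seed("corpus/input", "\"") would produce a one-byte seed, assuming the corpus/ directory already exists. In a shell, printf '"' > corpus/input should have the same effect.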

If anything in the details I have provided is unclear, do not hesitate to ask and I will clarify it.

What are some other sites/forums where I could be heard? Ideally I would like to contact the developers of ASAN. I will be more than happy for any help.

UPDATE 04/10/2022

llvm/clang have been upgraded from 13.0.1 to the latest available version in the Arch repository - 14.0.6. The problem still persists.

Opened an issue in the google/sanitizers repository.

Upvotes: 2

Views: 1411

Answers (1)

ruslanbyku

Once more I reread my question and the comments, looked at the code again, and came across this thought:

AddressSanitizer is not expected to produce false positives. If you see one, look again; most likely it is a true positive!

As @Richard Critten and @chi correctly pointed out in the comments section, the strdup function needs a null-terminated string, so I changed my solution

from

(void) sanitize_cookie_path((char*) data);

to

char* string_ = new char[size + 1];
memcpy(string_, data, size);
string_[size] = 0x0;

(void) sanitize_cookie_path(string_);

delete[] string_;

The above solution converts the raw byte array data into a null-terminated string string_ and passes it to the function. libFuzzer hands over exactly size bytes with no terminator, so strdup was reading past the end of the buffer, which is exactly what ASAN reported. This solution works as expected.

It was just a stupid mistake that I had overlooked. Thanks again to @Richard Critten and @chi and everyone who tried to help.
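
Put together, a complete harness could look roughly like the sketch below. It is an illustration under assumptions, not the original project's code: target_under_test is a hypothetical stand-in for sanitize_cookie_path so that the block compiles on its own, and to_c_string is a helper name made up here. It uses std::string instead of new[]/delete[], and it also frees the returned buffer, because the real function returns memory obtained from strdup.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for the real target (assumption for this sketch).
// Like strdup(), it relies on `text` being null-terminated and returns
// a malloc'ed copy that the caller must free.
extern "C" char* target_under_test(const char* text) {
    const size_t len = std::strlen(text);
    char* copy = static_cast<char*>(std::malloc(len + 1));
    if (copy != nullptr) {
        std::memcpy(copy, text, len + 1);
    }
    return copy;
}

// Copy the fuzzer's raw bytes into a std::string, whose c_str()
// is guaranteed to end in '\0'.
static std::string to_c_string(const uint8_t* data, size_t size) {
    if (size == 0) {
        return std::string();
    }
    return std::string(reinterpret_cast<const char*>(data), size);
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    const std::string input = to_c_string(data, size);
    char* result = target_under_test(input.c_str());
    std::free(result);  // release the buffer returned by the target
    return 0;
}
```

One consequence of this design: if the fuzzer input contains a zero byte in the middle, the target only sees the part before it, which is usually acceptable for a function that takes a C string.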

Since there is no bug, I am going to retract my false accusations in google/sanitizers.

Upvotes: 1
