Antonio Perez

Reputation: 493

Smallest executable program (x86-64 Linux)

I recently came across this post describing the smallest possible ELF executable for Linux. However, the post was written for 32-bit, and I was unable to get the final version to compile on my machine.

This brings me to the question: what's the smallest x86-64 ELF executable it's possible to write that runs without error?

Abusing or violating the ELF specification is ok as long as current Linux kernels will run it in practice, as with the last couple of versions in the MuppetLabs 32-bit teensy ELF executable article.

Upvotes: 14

Views: 7237

Answers (3)

MestreLion

Reputation: 13736

Most articles out there give up on ld and resort to hand-crafting the ELF headers waaay too soon, including the amazing answer from Matteo Italia.

I've discovered you can reach the 120-ish byte limit of standard ELF headers + program code using only standard tools, with no need to insert the ELF header in your ASM.

Standard assembly code, with a few tricks:

; tiny.asm
BITS 64
SECTION .text align=1
GLOBAL _start
_start:
    ; _exit(42)
    ; Linux zeroes the registers at process start in practice (not a psABI guarantee), so al/dil are enough
    mov       al, 60  ; Select the _exit syscall (60 in Linux ABI)
    mov      dil, 42  ; Set the exit code argument for _exit
    syscall           ; Perform the selected syscall

Remarks:

  • Using al/dil instead of the common eax/edi or the naive rax/rdi, for a 7-byte code payload. This works because Linux zeroes the general-purpose registers when a static executable starts; that is kernel behavior in practice, not something the psABI guarantees.
  • align=1 so ld with its default linker script for a non-PIE can pick 0x400078 as the program entry point address, putting the payload right after the ELF header. As explained by @ecm, NASM's default is align=16, which makes ld use 8 bytes of padding to get to 0x400080.
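The entry-point arithmetic behind the align=1 remark can be sketched in a few lines of Python. This is only an illustration: the base address 0x400000 is taken from the readelf output in this answer, and the helper name entry_point is made up for the example.

```python
import struct

BASE = 0x400000                                    # non-PIE load address (assumed from the readelf output)
EHDR_SIZE = struct.calcsize("<16sHHIQQQIHHHHHH")   # Elf64_Ehdr, 64 bytes
PHDR_SIZE = struct.calcsize("<IIQQQQQQ")           # Elf64_Phdr, 56 bytes

def entry_point(align: int) -> int:
    """Hypothetical helper: first address after the headers that is a multiple of align."""
    end_of_headers = BASE + EHDR_SIZE + PHDR_SIZE
    return (end_of_headers + align - 1) // align * align

print(hex(entry_point(1)))    # 0x400078: payload right after the headers
print(hex(entry_point(16)))   # 0x400080: 8 bytes of padding first
```

With align=1 nothing is rounded up, so the 8 padding bytes disappear from the file.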

And now some fine-tuned command-line arguments:

nasm -f elf64 tiny.asm &&
ld -s -no-pie -z noseparate-code tiny.o -o tiny

Results:

$ wc -c tiny && ./tiny; echo $?
336 tiny
42

That's already better than the 352 bytes Matteo had in his last ld attempt, and the code payload only accounts for 3 of the 16 bytes saved.

But the payload is not the point here. My goal is to get rid of all section headers, so we get to the 120+payload size which is the absolute minimum before manually fiddling with the ELF header.

Given our 7-byte payload, we aim for a 127-byte binary, breaking the ~300 bytes barrier.

$ strip --strip-section-headers tiny && wc -c tiny && ./tiny; echo $?
127 tiny
42

A 62% reduction with a single strip, and goal achieved!
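For anyone who wants to double-check the numbers, here is a rough Python sketch of the size budget. The payload bytes are my assumption of how nasm encodes the three instructions (mov al,60 / mov dil,42 / syscall); the header sizes come from the Elf64 struct layouts.

```python
import struct

EHDR_SIZE = struct.calcsize("<16sHHIQQQIHHHHHH")   # Elf64_Ehdr: 64 bytes
PHDR_SIZE = struct.calcsize("<IIQQQQQQ")           # Elf64_Phdr: 56 bytes

# Assumed encoding: b0 3c = mov al,60; 40 b7 2a = mov dil,42; 0f 05 = syscall
payload = bytes.fromhex("b03c" "40b72a" "0f05")

target = EHDR_SIZE + PHDR_SIZE + len(payload)
print(target)                                      # 127

before = 336                                       # size before stripping section headers
reduction = round(100 * (before - target) / before)
print(f"{reduction}% smaller")                     # 62% smaller
```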

$ readelf -Wa tiny
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x400078
  Start of program headers:          64 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         1
  Size of section headers:           0 (bytes)
  Number of section headers:         0
  Section header string table index: 0

There are no sections in this file.

There are no section groups in this file.

Program Headers:
 Type  Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
 LOAD  0x000000 0x0000000000400000 0x0000000000400000 0x00007f 0x00007f R E 0x1000

There is no dynamic section in this file.

There are no relocations in this file.
No processor specific unwind information to decode

Dynamic symbol information is not available for displaying symbols.

No version information found in this file.

This is a result Matteo and most articles only achieved by pasting the ELF header in ASM and editing the loader address by hand, but now we tamed nasm, ld and strip to do it automatically for us.
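If you'd rather not depend on readelf, the same fields can be read with Python's struct module. The sketch below rebuilds a stand-in image from the values shown in the readelf output (it is not the real tiny file, and the payload encoding is assumed); summarize is a made-up helper name.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")   # Elf64_Ehdr, little endian
PHDR = struct.Struct("<IIQQQQQQ")           # Elf64_Phdr, little endian

def summarize(image: bytes) -> dict:
    """Hypothetical helper: pull a few fields out of an ELF64 image."""
    (ident, e_type, machine, version, entry, phoff, shoff, flags,
     ehsize, phentsize, phnum, shentsize, shnum, shstrndx) = EHDR.unpack_from(image, 0)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    (p_type, p_flags, p_offset, p_vaddr,
     p_paddr, p_filesz, p_memsz, p_align) = PHDR.unpack_from(image, phoff)
    return {"entry": entry, "phnum": phnum, "shnum": shnum,
            "shoff": shoff, "p_vaddr": p_vaddr, "p_filesz": p_filesz}

# Stand-in for the stripped binary, rebuilt from the readelf values above.
payload = bytes.fromhex("b03c40b72a0f05")            # assumed 7-byte payload encoding
ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)     # ELF64, little endian, version 1
ehdr = EHDR.pack(ident, 2, 62, 1, 0x400078, 64, 0, 0, 64, 56, 1, 0, 0, 0)
phdr = PHDR.pack(1, 5, 0, 0x400000, 0x400000, 0x7F, 0x7F, 0x1000)
image = ehdr + phdr + payload

info = summarize(image)
print(len(image), hex(info["entry"]), info["shnum"])  # 127 0x400078 0
```

To inspect a real file, pass summarize the bytes read from it in binary mode.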

And, for completeness, the 4-byte payload true clone, which yields an impressive 124 bytes in just 5 lines. I believe this is the smallest possible size before non-standard approaches like overlapping the headers and embedding the payload in them:

SECTION .text align=1
GLOBAL _start
_start:
    mov       al, 60
    syscall

nasm -f elf64 tiny.asm &&
ld -s -no-pie -z noseparate-code tiny.o -o tiny &&
strip --strip-section-headers tiny && wc -c tiny && ./tiny; echo $?
124 tiny
0

A tiny executable and a tiny source!

Upvotes: 4

n132

Reputation: 31

Updated Answer

After seeing the tricks used in @Matteo Italia's answer, I found it's possible to reach 112 bytes, since we can hide not only the string but also the code in the ELF header.

Explanation: the key idea is hiding everything in the header, including the string "Hello World!\n" and the code that prints it. We should first test which parts of the header are modifiable (that is, the value can be changed and the program still executes). Then we hide our data and code in the header, as the code below shows (assemble with nasm -f bin ./x.asm):

  • This source code is based on @Matteo Italia's answer but completes the part he didn't show: printing Hello World as well as exiting. There doesn't seem to be a way to make it any shorter; the kernel requires the file to be big enough to contain the ELF headers.
  • This version fills the remaining usable space inside and between the ELF headers with nop instructions, since those bytes can't be dropped. There is still spare room in p_paddr and p_align.

bits 64
            org 0x08048000

ehdr:                                           ;   Elf64_Ehdr
            db  0x7F, "ELF",                    ;   e_ident
_start:
            mov dl, 13
            mov esi,STR
            pop rax
            syscall
            jmp _S0
            dw  2                               ;   e_type
            dw  62                              ;   e_machine
            dd  0xff                            ;   e_version
            dq  _start                          ;   e_entry
            dq  phdr - $$                       ;   e_phoff
STR:
            db "Hello Wo"                       ;   e_shoff
            db "rld!"                           ;   e_flags
            dw  0x0a                            ;   e_ehsize, where we hide the newline character
            dw  phdrsize                        ;   e_phentsize
phdr:                                           ;   Elf64_Phdr
            dw  1                               ;   e_phnum         p_type
            dw  0                               ;   e_shentsize
            dw  5                               ;   e_shnum         p_flags
            dw  0                               ;   e_shstrndx
ehdrsize    equ $ - ehdr
            dq  0                               ;   p_offset
            dq  $$                              ;   p_vaddr
_S0:
            nop                                 ;   p_paddr: these 8 bytes are free for code;
            nop                                 ;   the nops only show that instructions fit here
            nop
            nop
            nop
            nop
            jmp _S1                             ;   last 2 bytes of p_paddr
            dq  filesize                        ;   p_filesz
            dq  filesize                        ;   p_memsz
_S1:
            mov eax,60 ; p_align[0:5]
            syscall    ; p_align[5:7]
            nop        ; p_align[7:8]

phdrsize    equ     $ - phdr
filesize    equ     $ - $$
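A quick way to sanity-check the layout above is to add up the pieces in Python. The instruction lengths below are my assumption of how nasm encodes them (short jumps, imm32 for mov esi); the labels are only for illustration.

```python
# (label, size in bytes) in file order; instruction sizes are assumed encodings
pieces = [
    ("magic", 4),                         # 7f 'E' 'L' 'F'
    ("mov dl,13", 2), ("mov esi,STR", 5), ("pop rax", 1),
    ("syscall", 2), ("jmp _S0", 2),       # code fills the rest of e_ident
    ("e_type", 2), ("e_machine", 2), ("e_version", 4),
    ("e_entry", 8), ("e_phoff", 8),
    ("STR", 14),                          # "Hello World!" + dw 0x0a
    ("e_phentsize", 2),
    ("phdr", 8),                          # p_type/p_flags over e_phnum..e_shstrndx
    ("p_offset", 8), ("p_vaddr", 8),
    ("_S0", 8),                           # 6 nops + jmp _S1 inside p_paddr
    ("p_filesz", 8), ("p_memsz", 8),
    ("_S1", 8),                           # mov eax,60 / syscall / nop inside p_align
]

offsets = {}
pos = 0
for label, size in pieces:
    offsets[label] = pos
    pos += size

print(offsets["e_type"], offsets["phdr"], pos)   # 16 56 112
print(pos - offsets["phdr"])                     # 56, the value stored in e_phentsize
```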

Original Post:

I have a 129-byte x64 "Hello World!".

Step 1. Assemble the following code with nasm -f bin hw.asm (the output file is named hw):

; hw.asm
  BITS 64
  org 0x400000

  ehdr:           ; Elf64_Ehdr
    db 0x7f, "ELF", 2, 1, 1, 0 ; e_ident
    times 8 db 0
    dw  2         ; e_type
    dw  0x3e      ; e_machine
    dd  1         ; e_version
    dq  _start    ; e_entry
    dq  phdr - $$ ; e_phoff
    dq  0         ; e_shoff
    dd  0         ; e_flags
    dw  ehdrsize  ; e_ehsize
    dw  phdrsize  ; e_phentsize
  phdr:           ; Elf64_Phdr
    dd  1         ; e_phnum      ; p_type
                  ; e_shentsize
    dd  5         ; e_shnum      ; p_flags
                  ; e_shstrndx
  ehdrsize  equ  $ - ehdr
    dq  0         ; p_offset
    dq  $$        ; p_vaddr
    dq  $$        ; p_paddr
    dq  filesize  ; p_filesz
    dq  filesize  ; p_memsz
    dq  0x1000    ; p_align
  phdrsize  equ  $ - phdr
  
  _start:
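    ; write(0, hello, 60): rdi is 0 at startup, so this goes to fd 0 (fine when it is a terminal)
    pop rax         ; rax = argc = 1 = SYS_write (when run without arguments)
    mov dl, 60      ; length 60, so that write returns 60 = SYS_exit
    mov esi, hello
    syscall         ; write; rax now holds 60
    syscall         ; exit(0), since rdi is still 0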

  hello: db "Hello World!", 10 ; 10 is the ASCII code for newline

  filesize  equ  $ - $$

Step 2. Patch the result with the following Python script:

# Drop the 8-byte p_align field so that the code overlaps it, then fix up the addresses.
with open('./hw', 'rb') as f:
    pro = bytearray(f.read())
print(len(pro))
cut = 0x68                        # file offset of p_align
pro[0x18] = cut                   # low byte of e_entry: _start moves from 0x70 to 0x68
pro[0x74] = 0x7c - (0x70 - cut)   # low byte of the address in `mov esi, hello`
pro = pro[:cut] + pro[0x70:]      # remove p_align
with open("X", 'wb') as f:
    f.write(pro)

You should get a 129-byte "Hello World" (make it executable with chmod +x X before running it).

[18:19:02] n132 :: xps  ➜  /tmp » strace ./X
execve("./X", ["./X"], 0x7fffba3db670 /* 72 vars */) = 0
write(0, "Hello World!\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 60Hello World!
) = 60
exit(0)                                 = ?
+++ exited with 0 +++
[18:19:04] n132 :: xps  ➜  /tmp » ./X
Hello World!
[18:19:11] n132 :: xps  ➜  /tmp » ls -la ./X
-rwxrwxr-x 1 n132 n132 129 Jan 29 18:18 ./X
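The magic offsets in the patch script (0x68, 0x70, 0x74, 0x7c) fall out of the hw.asm layout. Here is a small Python sketch that derives them; the instruction lengths are my assumption of the nasm encodings, and the variable names are only for illustration.

```python
phdr_off = 0x38                    # phdr starts 8 bytes before the end of the 64-byte ehdr
p_align_off = phdr_off + 48        # p_align is the last 8-byte field of Elf64_Phdr
start_off = phdr_off + 56          # _start follows the program header
imm_off = start_off + 1 + 2 + 1    # after pop rax, mov dl,60 and the mov esi opcode byte
code_len = 1 + 2 + 5 + 2 + 2       # pop rax, mov dl, mov esi imm32, syscall, syscall
hello_off = start_off + code_len
size_before = hello_off + len(b"Hello World!\n")

cut = p_align_off
removed = start_off - cut          # bytes dropped by the script

print(hex(cut), hex(start_off), hex(imm_off), hex(hello_off))  # 0x68 0x70 0x74 0x7c
print(size_before, size_before - removed)                      # 137 129
print(hex(hello_off - removed))                                # 0x74, new low byte of hello
```

So the script writes the value 0x74 at file offset 0x74 of the unpatched file; the two numbers matching is a coincidence of this layout.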

Upvotes: 3

Matteo Italia

Reputation: 126957

Starting from an answer of mine about the "real" entrypoint of an ELF executable on Linux and "raw" syscalls, we can strip it down to

bits 64
global _start
_start:
   mov di,42        ; only the low byte of the exit code is kept,
                    ; so we can use di instead of the full edi/rdi
   xor eax,eax
   mov al,60        ; shorter than mov eax,60
   syscall          ; perform the syscall

I don't think you can get it to be any smaller without going out of specs - in particular, the psABI doesn't guarantee anything about the state of eax. This gets assembled to precisely 10 bytes (as opposed to the 7 bytes of the 32 bit payload):

66 bf 2a 00 31 c0 b0 3c 0f 05
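To see which bytes belong to which instruction, the dump can be split up like this (a small Python sketch; the per-instruction grouping is my reading of the x86-64 encodings):

```python
encodings = [
    ("mov di,42",   "66 bf 2a 00"),   # operand-size prefix + mov r16, imm16
    ("xor eax,eax", "31 c0"),
    ("mov al,60",   "b0 3c"),
    ("syscall",     "0f 05"),
]

payload = b"".join(bytes.fromhex(hexbytes) for _, hexbytes in encodings)

for asm, hexbytes in encodings:
    print(f"{asm:<12} {hexbytes}")
print(len(payload), payload.hex(" "))   # 10 66 bf 2a 00 31 c0 b0 3c 0f 05
```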

The straightforward way (assemble with nasm, link with ld) gives me a 352-byte executable.

The first "real" transformation the article makes is building the ELF "by hand". Doing the same here (with some modifications, as the ELF header for x86_64 is a bit bigger):

bits 64
            org 0x08048000

ehdr:                                           ; Elf64_Ehdr
            db  0x7F, "ELF", 2, 1, 1, 0         ;   e_ident
    times 8 db  0
            dw  2                               ;   e_type
            dw  62                              ;   e_machine
            dd  1                               ;   e_version
            dq  _start                          ;   e_entry
            dq  phdr - $$                       ;   e_phoff
            dq  0                               ;   e_shoff
            dd  0                               ;   e_flags
            dw  ehdrsize                        ;   e_ehsize
            dw  phdrsize                        ;   e_phentsize
            dw  1                               ;   e_phnum
            dw  0                               ;   e_shentsize
            dw  0                               ;   e_shnum
            dw  0                               ;   e_shstrndx

ehdrsize    equ $ - ehdr

phdr:                                           ; Elf64_Phdr
            dd  1                               ;   p_type
            dd  5                               ;   p_flags
            dq  0                               ;   p_offset
            dq  $$                              ;   p_vaddr
            dq  $$                              ;   p_paddr
            dq  filesize                        ;   p_filesz
            dq  filesize                        ;   p_memsz
            dq  0x1000                          ;   p_align

phdrsize    equ     $ - phdr

_start:
   mov di,42        ; only the low byte of the exit code is kept,
                    ; so we can use di instead of the full edi/rdi
   xor eax,eax
   mov al,60        ; shorter than mov eax,60
   syscall          ; perform the syscall

filesize      equ     $ - $$

we get down to 130 bytes. This is a tad bigger than the article's 91-byte executable, mostly because several header fields are 64 bits wide instead of 32.
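The two totals can be reproduced from the struct layouts. This is a rough Python sketch; the format strings are my transcription of Elf64_Ehdr/Elf64_Phdr and their 32-bit counterparts, and the payload sizes are the 10 and 7 bytes mentioned earlier.

```python
import struct

# (ELF header, program header, payload) sizes in bytes
elf64 = (struct.calcsize("<16sHHIQQQIHHHHHH"),   # Elf64_Ehdr
         struct.calcsize("<IIQQQQQQ"),           # Elf64_Phdr
         10)                                     # 64-bit payload
elf32 = (struct.calcsize("<16sHHIIIIIHHHHHH"),   # Elf32_Ehdr
         struct.calcsize("<IIIIIIII"),           # Elf32_Phdr
         7)                                      # 32-bit payload

print(elf64, sum(elf64))   # (64, 56, 10) 130
print(elf32, sum(elf32))   # (52, 32, 7) 91
```

Of the 39 extra bytes, 36 come from the wider header fields and 3 from the larger payload.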


We can then apply some tricks similar to the article's. The partial overlap of phdr and ehdr can be done, although the order of fields in phdr is different, so we have to overlap p_flags with e_shnum (which should be ignored anyway, since e_shentsize is 0).
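To make the overlap concrete: the last four 16-bit fields of the ELF header share their 8 bytes with the first two 32-bit fields of the program header. A minimal Python sketch of how the same bytes read under both views (the values 1 and 5 are the p_type and p_flags used in the listings here):

```python
import struct

# The 8 shared bytes, written as the start of Elf64_Phdr
shared = struct.pack("<II", 1, 5)        # p_type = PT_LOAD (1), p_flags = R+X (5)

# The same bytes, read as the tail of Elf64_Ehdr
e_phnum, e_shentsize, e_shnum, e_shstrndx = struct.unpack("<HHHH", shared)
print(e_phnum, e_shentsize, e_shnum, e_shstrndx)   # 1 0 5 0

print(64 + 56 - len(shared))             # 112: headers alone once they overlap
```

e_phnum comes out as 1 as required, and the bogus e_shnum of 5 sits next to an e_shentsize of 0.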

Moving the code inside the header is slightly more difficult, as it's 3 bytes larger, but that part of the header is just as big as in the 32-bit case. We overcome this by starting 2 bytes earlier, at e_ident[6], overwriting byte 7 (padding in the old spec, EI_OSABI in current ones; ok) and the EI_VERSION byte (not ok, but still works).

So, we reach:

bits 64
            org 0x08048000

ehdr:                                           ; Elf64_Ehdr
            db  0x7F, "ELF", 2, 1,              ;   e_ident
_start:
            mov di,42        ; only the low byte of the exit code is kept,
                            ; so we can use di instead of the full edi/rdi
            xor eax,eax
            mov al,60        ; shorter than mov eax,60
            syscall          ; perform the syscall
            dw  2                               ;   e_type
            dw  62                              ;   e_machine
            dd  1                               ;   e_version
            dq  _start                          ;   e_entry
            dq  phdr - $$                       ;   e_phoff
            dq  0                               ;   e_shoff
            dd  0                               ;   e_flags
            dw  ehdrsize                        ;   e_ehsize
            dw  phdrsize                        ;   e_phentsize
phdr:                                           ; Elf64_Phdr
            dw  1                               ;   e_phnum         p_type
            dw  0                               ;   e_shentsize
            dw  5                               ;   e_shnum         p_flags
            dw  0                               ;   e_shstrndx
ehdrsize    equ $ - ehdr
            dq  0                               ;   p_offset
            dq  $$                              ;   p_vaddr
            dq  $$                              ;   p_paddr
            dq  filesize                        ;   p_filesz
            dq  filesize                        ;   p_memsz
            dq  0x1000                          ;   p_align

phdrsize    equ     $ - phdr
filesize    equ     $ - $$

which is 112 bytes long.

Here I stop for the moment, as I don't have much time for this right now. You now have the basic layout with the relevant modifications for 64-bit, so you just have to experiment with more audacious overlaps.

Upvotes: 18
