Nathan Long

Reputation: 125972

How could I write "hello world" in binary?

Suppose I wanted to write a program to display "hello world", and I wanted to write it in binary. How could I do this?

I have some idea that:

Can anybody walk me through this?

Upvotes: 24

Views: 59226

Answers (2)

vartec

Reputation: 134611

It's a bit more complicated than that, because printing "Hello, world!" to stdout requires a system call, so you need to know the correct kernel syscall number, which varies by operating system. You also need to know the executable file format, which varies too, although ELF (Executable and Linkable Format) is shared by Linux and several flavors of Unix.

See Hello, world! in assembler.

This is Linux assembler code (NASM syntax, 32-bit x86):

section .text
    global _start           ;must be declared for linker (ld)

_start:                 ;tell linker entry point

    mov edx,len ;message length
    mov ecx,msg ;message to write
    mov ebx,1   ;file descriptor (stdout)
    mov eax,4   ;system call number (sys_write)
    int 0x80    ;call kernel

    mov eax,1   ;system call number (sys_exit)
    int 0x80    ;call kernel

section .data

msg db  'Hello, world!',0xa ;our dear string
len equ $ - msg         ;length of our dear string

... which, when assembled and linked on 32-bit Linux, results in a binary of 360 bytes, although it's mostly zeros:

00000000  7f 45 4c 46 01 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
00000010  02 00 03 00 01 00 00 00  80 80 04 08 34 00 00 00  |............4...|
00000020  c8 00 00 00 00 00 00 00  34 00 20 00 02 00 28 00  |........4. ...(.|
00000030  04 00 03 00 01 00 00 00  00 00 00 00 00 80 04 08  |................|
00000040  00 80 04 08 9d 00 00 00  9d 00 00 00 05 00 00 00  |................|
00000050  00 10 00 00 01 00 00 00  a0 00 00 00 a0 90 04 08  |................|
00000060  a0 90 04 08 0e 00 00 00  0e 00 00 00 06 00 00 00  |................|
00000070  00 10 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000080  ba 0e 00 00 00 b9 a0 90  04 08 bb 01 00 00 00 b8  |................|
00000090  04 00 00 00 cd 80 b8 01  00 00 00 cd 80 00 00 00  |................|
000000a0  48 65 6c 6c 6f 2c 20 77  6f 72 6c 64 21 0a 00 2e  |Hello, world!...|
000000b0  73 68 73 74 72 74 61 62  00 2e 74 65 78 74 00 2e  |shstrtab..text..|
000000c0  64 61 74 61 00 00 00 00  00 00 00 00 00 00 00 00  |data............|
000000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000000f0  0b 00 00 00 01 00 00 00  06 00 00 00 80 80 04 08  |................|
00000100  80 00 00 00 1d 00 00 00  00 00 00 00 00 00 00 00  |................|
00000110  10 00 00 00 00 00 00 00  11 00 00 00 01 00 00 00  |................|
00000120  03 00 00 00 a0 90 04 08  a0 00 00 00 0e 00 00 00  |................|
00000130  00 00 00 00 00 00 00 00  04 00 00 00 00 00 00 00  |................|
00000140  01 00 00 00 03 00 00 00  00 00 00 00 00 00 00 00  |................|
00000150  ae 00 00 00 17 00 00 00  00 00 00 00 00 00 00 00  |................|
00000160  01 00 00 00 00 00 00 00                           |........|
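To see where the non-zero bytes at the top of that dump come from, here is a rough Python sketch that decodes the first 52 bytes as a 32-bit little-endian ELF file header. The hex string is copied from the dump above; the variable names follow the field names used in the ELF specification, and the script itself is only an illustration, not part of the original build.

```python
import struct

# First 52 bytes of the 360-byte dump above (the ELF32 file header).
header = bytes.fromhex(
    "7f454c46010101000000000000000000"
    "02000300010000008080040834000000"
    "c8000000000000003400200002002800"
    "04000300"
)

# ELF32 header, little-endian: 16-byte ident, then fixed-width fields.
(ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
 e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
 e_shstrndx) = struct.unpack("<16sHHIIIIIHHHHHH", header)

print("magic        :", ident[:4])
print("type/machine :", e_type, e_machine)      # 2 = executable, 3 = Intel 386
print(f"entry point  : {e_entry:#010x}")
print("program hdrs :", e_phnum, "at offset", e_phoff)
print("section hdrs :", e_shnum, "at offset", e_shoff)

# The section header table is the last thing in this file, so its end
# offset should equal the file size quoted above.
print("file size    :", e_shoff + e_shnum * e_shentsize)
```

The entry point decoded here is the address of `_start`, which is why the machine code begins at file offset 0x80 in the dump.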

Since you want to "compile by hand", this basically means translating the assembler mnemonics above to their opcodes, and then wrapping the result in the correct binary format (ELF in the example above).
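The translation step can be sketched in a few lines of Python. This is a minimal illustration covering only the two instruction forms used in the listing, `mov r32, imm32` and `int imm8`; the helper names `mov_r32_imm32` and `int_imm8` are made up for this example, and the address of `msg` is read off the hex dump rather than computed by a linker.

```python
import struct

# Register numbers used by the x86 "mov r32, imm32" encoding.
REG = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3}

def mov_r32_imm32(reg, value):
    # Opcode is 0xB8 plus the register number, followed by the
    # 32-bit immediate in little-endian byte order.
    return bytes([0xB8 + REG[reg]]) + struct.pack("<I", value)

def int_imm8(vector):
    # "int imm8" is 0xCD followed by the interrupt number.
    return bytes([0xCD, vector])

MSG_ADDR = 0x080490A0   # where msg ends up in memory, per the dump
MSG_LEN = 14            # 'Hello, world!' plus the newline

code = b"".join([
    mov_r32_imm32("edx", MSG_LEN),    # mov edx,len
    mov_r32_imm32("ecx", MSG_ADDR),   # mov ecx,msg
    mov_r32_imm32("ebx", 1),          # mov ebx,1
    mov_r32_imm32("eax", 4),          # mov eax,4
    int_imm8(0x80),                   # int 0x80
    mov_r32_imm32("eax", 1),          # mov eax,1
    int_imm8(0x80),                   # int 0x80
])

print(code.hex(" "))
```

The printed bytes should line up with the 29 bytes starting at offset 0x80 in the dump (`ba 0e 00 00 00 b9 a0 90 04 08 ...`).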

UPDATE: As this answer by @adam-rosenfield shows, the ELF binary for "Hello, world!" can be handcrafted down to 116 bytes. The original answer is now deleted, but still visible to moderators, so here's a copy:

Here's a 32-byte version using Linux system calls:

        .globl _start
_start:
        movb $4, %al            # syscall 4 = sys_write (relies on the rest of %eax being zero at startup)
        xor %ebx, %ebx
        inc %ebx                # fd 1 = stdout
        movl $hello, %ecx
        xor %edx, %edx
        movb $11, %dl
        int $0x80               # sys_write(1, $hello, 11)
        xor %eax, %eax
        inc %eax
        int $0x80               # sys_exit(%ebx), and %ebx still holds 1
hello:
        .ascii "Hello world"

When compiled into a minimal ELF file, the full executable is 116 bytes:

00000000  7f 45 4c 46 01 01 01 00  00 00 00 00 00 00 00 00  |.ELF............| 
00000010  02 00 03 00 01 00 00 00  54 80 04 08 34 00 00 00  |........T...4...| 
00000020  00 00 00 00 00 00 00 00  34 00 20 00 01 00 00 00  |........4. .....| 
00000030  00 00 00 00 01 00 00 00  00 00 00 00 00 80 04 08  |................|
00000040  00 80 04 08 74 00 00 00  74 00 00 00 05 00 00 00  |....t...t.......|
00000050  00 10 00 00 b0 04 31 db  43 b9 69 80 04 08 31 d2  |......1.C.i...1.|
00000060  b2 0b cd 80 31 c0 40 cd  80 48 65 6c 6c 6f 20 77  |....1.@..Hello w|
00000070  6f 72 6c 64                                       |orld| 
00000074 
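For anyone who wants to reproduce those 116 bytes without an assembler, here is a sketch in Python that packs the ELF header, a single program header, the 21 bytes of machine code and the 11-byte string by hand. The constants (`BASE`, the header sizes, the segment flags) are read off the dump above; `build_code` is a hypothetical helper name, and the opcode bytes are simply the ones visible in the dump.

```python
import struct

BASE = 0x08048000              # load address, as in the dump above
EHDR_SIZE, PHDR_SIZE = 52, 32  # ELF32 file header and program header sizes
message = b"Hello world"

def build_code(msg_addr, msg_len):
    return b"".join([
        b"\xb0\x04",                            # movb $4, %al
        b"\x31\xdb",                            # xor %ebx, %ebx
        b"\x43",                                # inc %ebx
        b"\xb9" + struct.pack("<I", msg_addr),  # movl $hello, %ecx
        b"\x31\xd2",                            # xor %edx, %edx
        bytes([0xB2, msg_len]),                 # movb $11, %dl
        b"\xcd\x80",                            # int $0x80
        b"\x31\xc0",                            # xor %eax, %eax
        b"\x40",                                # inc %eax
        b"\xcd\x80",                            # int $0x80
    ])

# The code sits right after the headers; the string sits right after the code.
entry = BASE + EHDR_SIZE + PHDR_SIZE
code_len = len(build_code(0, len(message)))
code = build_code(entry + code_len, len(message))
file_size = EHDR_SIZE + PHDR_SIZE + len(code) + len(message)

ehdr = struct.pack(
    "<16sHHIIIIIHHHHHH",
    b"\x7fELF\x01\x01\x01",    # magic, 32-bit, little-endian, version 1
    2, 3, 1,                   # executable, Intel 386, version 1
    entry, EHDR_SIZE, 0, 0,    # entry, phoff, shoff (none), flags
    EHDR_SIZE, PHDR_SIZE, 1,   # ehsize, phentsize, phnum
    0, 0, 0)                   # no section headers at all
phdr = struct.pack(
    "<IIIIIIII",
    1, 0, BASE, BASE,          # PT_LOAD, file offset 0, vaddr, paddr
    file_size, file_size,      # filesz, memsz: map the whole file
    5, 0x1000)                 # flags R+X, 4 KiB alignment

image = ehdr + phdr + code + message
print(len(image), "bytes")
print(image[EHDR_SIZE + PHDR_SIZE:].hex(" "))
```

Writing `image` to a file with `open(path, "wb")` and marking it executable should give the same file as the dump, which only runs on a Linux system that can execute 32-bit x86 binaries.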

Upvotes: 30

TMN

Reputation: 3070

Normally, you'd use a hex editor for this. Figure out the assembly code, hand-assemble it, use the hex editor to enter the binary values, then save them to a file. Once you have your file, drop into your machine monitor and load the file at an available address, then jump to the first instruction. This was pretty common practice on single-board computers and is still done on microcontrollers today, but it's not something you're going to do on a contemporary OS. If you really want to do this, I'd recommend running a low-level emulator (SIMH will work) or working with a microcontroller (you can pick up a TI MSP430 development kit for less than five bucks).
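If no hex editor is at hand, the "enter the binary values, then save them to a file" step can be approximated with a short script. This is only a sketch: `write_hex_image` is a made-up helper name, the `#` note syntax in the listing is an assumption of this example, and the three instructions are borrowed from the other answer purely as sample input.

```python
import os
import tempfile

def write_hex_image(hex_text, path):
    """Convert hand-assembled hex digits to raw bytes and save them."""
    # Drop anything after '#' so each line can carry a note.
    digits = "".join(line.split("#", 1)[0] for line in hex_text.splitlines())
    data = bytes.fromhex(digits)  # spaces between bytes are ignored
    with open(path, "wb") as f:
        f.write(data)
    return data

listing = """
b0 04        # movb $4, %al
31 db        # xor %ebx, %ebx
43           # inc %ebx
"""

path = os.path.join(tempfile.mkdtemp(), "image.bin")
data = write_hex_image(listing, path)
print(len(data), "bytes written to", path)
```

The resulting file is a raw image with no headers, which is the form a machine monitor or microcontroller loader would expect, as opposed to an ELF file for a contemporary OS.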

Upvotes: 3
