Reputation: 3325
I compile the following program with gcc and receive an output executable file a.out:
#include <stdio.h>
int main () {
printf("hello, world\n");
}
When I execute cat a.out, why is the file in "gibberish" (what is this called?) and not machine language of 0s and 1s:
??????? H__PAGEZERO(__TEXT__text__TEXT?`??__stubs__TEXT
P__unwind_info__TEXT]P]__eh_frame__TEXT?H??__DATA__program_vars [continued]
Upvotes: 0
Views: 846
Reputation: 225132
It's in some kind of executable file format. On Linux, it's probably ELF, on Mac OS X it's probably Mach-O, and so on. There's even an a.out format, but it's not that common anymore.
It can't just be bare machine instructions - the operating system needs some information about how to load it, what dynamic libraries to attach to it, etc.
Upvotes: 4
Reputation: 9216
The file does consist of 0's and 1's, but when you open it with a text editor those bits are grouped into bytes and then interpreted as text ;) On Linux you can disassemble the output file to confirm that it contains machine instructions (x86 architecture):
objdump -D -mi386 a.out
Example output:
1: 83 ec 08 sub $0x8,%esp
4: be 01 00 00 00 mov $0x1,%esi
9: bf 00 00 00 00 mov $0x0,%edi
The second column contains those 0's and 1's in hexadecimal notation, and the third column contains the assembler mnemonics.
If you want to display those 0's and 1's simply type:
xxd -b a.out
Example output:
0000000: 01111111 01000101 01001100 01000110 00000010 00000001 .ELF..
0000006: 00000001 00000000 00000000 00000000 00000000 00000000 ......
Upvotes: 14
Reputation: 213688
The typical format on Linux systems these days is ELF. The ELF file may contain machine code, which you can examine with the objdump utility.
$ gcc main.c
$ objdump -d -j .text a.out

a.out:     file format elf64-x86-64

Disassembly of section .text:

(code omitted for brevity)

00000000004005ac :
  4005ac: 55                      push   %rbp
  4005ad: 48 89 e5                mov    %rsp,%rbp
  4005b0: bf 6c 06 40 00          mov    $0x40066c,%edi
  4005b5: e8 d6 fe ff ff          callq  400490
  4005ba: 5d                      pop    %rbp
  4005bb: c3                      retq
  4005bc: 0f 1f 40 00             nopl   0x0(%rax)
See? Machine code. The objdump utility helpfully prints it in hexadecimal, with the addresses on the left and the corresponding disassembled code on the right.
Upvotes: 0
Reputation: 70538
The a.out file is in a format that the loader of the OS you are using can understand. The readable pieces of text you see are markers for different parts of the 0s and 1s you expect.
The ? and ` characters show spots where there is unprintable binary data.
Upvotes: 1
Reputation: 308500
Characters are also made of 0's and 1's, and the computer has no way of knowing the difference. You asked it to show the file and it did.
In addition to the machine instructions, the binary file also contains layout and optional debug information which can be readable strings.
Upvotes: 1