Reputation: 43
I am trying to learn nasm. I want to make a program that prints "Hello, world." n times (in this case 10). I am trying to save the loop register value in a constant so that it is not changed when the body of the loop is executed. When I try to do this I receive a segmentation fault error. I am not sure why this is happening.
My code:
SECTION .DATA
print_str: db 'Hello, world.', 10
print_str_len: equ $-print_str
limit: equ 10
step: dw 1
SECTION .TEXT
GLOBAL _start
_start:
mov eax, 4 ; 'write' system call = 4
mov ebx, 1 ; file descriptor 1 = STDOUT
mov ecx, print_str ; string to write
mov edx, print_str_len ; length of string to write
int 80h ; call the kernel
mov eax, [step] ; moves the step value to eax
inc eax ; Increment
mov [step], eax ; moves the eax value to step
cmp eax, limit ; Compare sil to the limit
jle _start ; Loop while less or equal
exit:
mov eax, 1 ; 'exit' system call
mov ebx, 0 ; exit with error code 0
int 80h ; call the kernel
The result:
Hello, world.
Segmentation fault (core dumped)
The cmd:
nasm -f elf64 file.asm -o file.o
ld file.o -o file
./file
Upvotes: 1
Views: 848
Reputation: 364190
section .DATA
is the direct cause of the crash. Lower-case section .data
is special, and linked as a read-write (private) mapping of the executable. Section names are case-sensitive.
Upper-case .DATA
is not special for nasm or the linker, and it ends up as part of the text segment mapped read+exec without write permission.
Upper-case .TEXT
is also weird: by default objdump -drwC -Mintel
only disassembles the .text
section (to avoid disassembling data as if it were code), so it shows empty output for your executable.
On newer systems, the default for a section name NASM doesn't recognize doesn't include exec permission, so code in .TEXT
will segfault. Same as Assembly section .code and .text behave differently
After starting the program under GDB (gdb ./foo
, starti
), I looked at the process's memory map from another shell.
$ cat /proc/11343/maps
00400000-00401000 r-xp 00000000 00:31 110651257 /tmp/foo
7ffff7ffa000-7ffff7ffd000 r--p 00000000 00:00 0 [vvar]
7ffff7ffd000-7ffff7fff000 r-xp 00000000 00:00 0 [vdso]
7ffffffde000-7ffffffff000 rwxp 00000000 00:00 0 [stack]
As you can see, other than the special VDSO mappings and the stack, there's only the one file-backed mapping, and it has read+exec permission only.
Single-stepping inside GDB, the mov eax,DWORD PTR ds:0x400086
load succeeds, but the mov DWORD PTR ds:0x400086,eax
store faults. (See the bottom of the x86 tag wiki for GDB asm tips.)
From readelf -a foo
, we can see the ELF program headers that tell the OS's program loader how to map it into memory:
$ readelf -a foo # broken version
...
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000000bf 0x00000000000000bf R 0x200000
Section to Segment mapping:
Segment Sections...
00 .DATA .TEXT
Notice how both .DATA
and .TEXT
are in the same segment. This is what you'd want for section .rodata
(a standard section name where you should put read-only constant data like your string), but it won't work for mutable global variables.
After fixing your asm to use section .data
and .text
, readelf shows us:
$ readelf -a foo # fixed version
...
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000000e7 0x00000000000000e7 R E 0x200000
LOAD 0x00000000000000e8 0x00000000006000e8 0x00000000006000e8
0x0000000000000010 0x0000000000000010 RW 0x200000
Section to Segment mapping:
Segment Sections...
00 .text
01 .data
Notice how segment 00 is R + E without W, and the .text
section is in there. Segment 01 is RW (read + write) without exec, and the .data
section is there.
The LOAD
tag means they're mapped into the process's virtual address space. Some section (like debug info) aren't, and are just metadata for other tools. But NASM flags unknown section names as progbits, i.e. loaded, which is why it was able to link and have the load not segfault.
After fixing it to use section .data
, your program runs without segfaulting.
The loop runs for one iteration, because the 2 bytes following step: dw 1
are not zero. After the dword load, RAX = 0x2c0001
on my system. (cmp
between 0x002c0002 and 0xa
makes the LE condition false because it's not less or equal.)
dw
means "data word" or "define word". Use dd
for a data dword.
BTW, there's no need to keep your loop counter in memory. You're not using RDI, RSI, RBP, or R8..R15 for anything so you could just keep it in a register. Like mov edi, limit
before the loop, and dec edi / jnz
at the bottom.
But actually you should use the 64-bit syscall
ABI if you want to build 64-bit code, not the 32-bit int 0x80
ABI. What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?. Or build 32-bit executables if you're following a guide or tutorial written for that.
Anyway, in that case you'd be able to use ebx
as your loop counter, because the syscall ABI uses different args for registers.
Upvotes: 4