segmentation fault while calling functions in nasm assembly

Question

I am trying to call my own functions in NASM. It works twice, then gives a segmentation fault. I created two functions, display1 and display2, which display "This is message1" and "This is message2" respectively. The functions work correctly the first time, but the program segfaults when they are called a second time.

global _start

section .text

display1:
    mov eax, 0x4
    mov ebx, 0x1
    mov ecx, var1
    mov edx, len1
    int 0x80
    ret
display2:
    mov eax, 0x4
    mov ebx, 0x1
    mov ecx, var2
    mov edx, var2
    int 0x80
    ret

_start:
    call display1
    call display2
    call display1
    call display2
    mov eax, 0x1
    mov ebx, 0x5
    int 0x80

section .data

    var1: db "This is message1", 0x0A, 0x00
    len1 equ $-var1
    var2: db "This is message2", 0x0A, 0x00
    len2 equ $-var2

This is message1
This is message2
.symtab.strtab.shstrtab.text.data�N!�$�'
                                        �
    �U�����"'��,1��6�=���I���P���functions.nasmdisplay1display2var1len1var2len2_start__bss_start_edata_endThis is message1
This is message2
.symtab.strtab.shstrtab.text.data�N!�$�'
                                        �
    �U�����"'��,1��6�=���I���P���functions.nasmdisplay1display2var1len1var2len2_start__bss_start_edata_endSegmentation fault (core dumped)

Peter Cordes · Accepted Answer

Congratulations, you've found a kernel bug (in your very old Ubuntu 12.04 / Linux 3.13.0-32-generic 32-bit kernel).

mov edx, var2 passes a very large integer (an address) as the size. This is why you get garbage after the 2nd message; the write system call is reading memory up to somewhere near an unmapped page and then stopping.

On a non-buggy kernel, write then returns and execution continues until the _exit system call, as you'd expect.

On your kernel, though, the int 0x80 instruction itself is what raises the segmentation fault.

IDK whether that's more or less insane than corrupting user-space and leading to a fault later.

It's probably not worth reporting this kernel bug anywhere. Ubuntu 12.04 LTS reached End of Life in 2017. The bug doesn't exist in modern kernels; it was probably either noticed and fixed, or fixed by accident as part of some other change, in the 7 years since that kernel was current.
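Kernel bug aside, the fix on the user-space side is to pass the length constant len2 in edx instead of the address var2. A minimal sketch of the corrected program, reusing the labels from the question; the build line in the first comment is an assumption about the toolchain, not something the question states:

; Build (assumed toolchain): nasm -f elf32 fixed.asm && ld -m elf_i386 -o fixed fixed.o
global _start

section .text

display1:
    mov eax, 0x4        ; write in the 32-bit int 0x80 ABI
    mov ebx, 0x1        ; fd 1 = stdout
    mov ecx, var1       ; buffer address
    mov edx, len1       ; byte count
    int 0x80
    ret
display2:
    mov eax, 0x4
    mov ebx, 0x1
    mov ecx, var2       ; buffer address
    mov edx, len2       ; byte count (was "mov edx, var2": an address used as the count)
    int 0x80
    ret

_start:
    call display1
    call display2
    call display1
    call display2
    mov eax, 0x1        ; exit
    mov ebx, 0x5        ; status 5, as in the question
    int 0x80

section .data

    var1: db "This is message1", 0x0A, 0x00
    len1 equ $-var1
    var2: db "This is message2", 0x0A, 0x00
    len2 equ $-var2

Note that len1 and len2 count the trailing 0x00, so each call writes 18 bytes including the NUL. Drop the 0x00 from the db lines if you don't want it in the output.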

What happens in non-buggy kernels with write() that eventually reads from unmapped pages

The write(2) man page definitely does not document the possibility of raising a signal on bad args, only of error codes like EFAULT.

I can't reproduce the segfault on Arch Linux with x86-64 Linux kernel 5.0.1; I get the expected garbage written and then write(2) returns the number of bytes written before it hit an unmapped page. Then execution continues until the _exit(5) system call and the process exits cleanly with status=5.

I thought write might return -EFAULT even after writing some bytes when you pass a pointer+size that includes unmapped pages, but that's not the case. The man page doesn't mention this specific case, but its wording about how other errors detected partway through a write are handled is consistent with this behavior. (Normally those errors are from things like the disk becoming full, or maybe the other side of the pipe closing.)

write(2) Linux man page

Note that a successful write() may transfer fewer than count bytes. Such partial writes can occur for various reasons; ...
...
In the event of a partial write, the caller can make another write() call to transfer the remaining bytes. The subsequent call will either transfer further bytes or may result in an error (e.g., if the disk is now full).

Linux definitely does not always transfer all the way to the end of the last mapped page when you do this. But it's interesting to see what happens for different cases.

It seems that it copies in chunks, and checks readability of each chunk as it goes. When a chunk would read from an unmapped page, the error is detected and it returns with a partial write. If you made another call with address = buf + first_retval, you'd probably get a -EFAULT. So it's very much like filling up the disk with a partial write and then detecting it by getting -ENOSPC when trying to write the rest.
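The retry pattern the man page describes can be sketched in the same 32-bit int 0x80 style. This is only an illustration: write_all, msg and msg_len are hypothetical names, the fd is hard-coded to stdout, and the build line is an assumed toolchain. Each partial write advances the pointer by the returned count, and a negative return value ends the loop with that -errno.

; Build (assumed toolchain): nasm -f elf32 writeall.asm && ld -m elf_i386 -o writeall writeall.o
global _start

section .text

; write_all (hypothetical helper): ecx = buffer, edx = length, fd fixed to stdout.
; Returns eax = 0 once everything is written, or the negative errno of the failing write.
; Not handled in this sketch: retrying on -EINTR, and a write that returns 0 for a
; nonzero count (which would make this loop spin).
write_all:
    push ebx
    push esi
    push edi
    mov esi, ecx        ; esi = next byte to write
    mov edi, edx        ; edi = bytes still to write
    xor eax, eax        ; result if the length is 0
    test edi, edi
    jz .out
.again:
    mov eax, 0x4        ; write
    mov ebx, 0x1        ; stdout
    mov ecx, esi
    mov edx, edi
    int 0x80
    test eax, eax
    js .out             ; negative: eax = -errno, give up
    add esi, eax        ; partial write: skip what was transferred
    sub edi, eax
    jnz .again          ; bytes left, so call write again
    xor eax, eax        ; everything written
.out:
    pop edi
    pop esi
    pop ebx
    ret

_start:
    mov ecx, msg
    mov edx, msg_len
    call write_all
    neg eax             ; 0 stays 0, -errno becomes errno
    mov ebx, eax        ; exit status: 0 on success, else the errno
    mov eax, 0x1        ; exit
    int 0x80

section .data

    msg: db "This is message1", 0x0A
    msg_len equ $-msg

With a bogus length like the one in the question, I'd expect the first write to be partial and the retry to fail, so the process would exit with the errno as its status (14 for EFAULT), but I haven't tested that.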

Redirecting output to a file (in tmpfs) on x86-64 Linux 5.0.1, write() returns 4078. That is 4096 - 18: I'm using a recent ld (Binutils 2.32), so the .data section is 4k-aligned in the executable, and the start of the section is also page-aligned in memory. var2 sits len1 = 18 bytes into the page, so the end of the page is at var2 + 4096 - len1.

$ strace ./2write > foo
strace: [ Process PID=28961 runs in 32 bit mode. ]
write(1, "This is message1\n\0", 18)    = 18
write(1, "This is message2\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 134520850) = 4078
write(1, "This is message1\n\0", 18)    = 18
write(1, "This is message2\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 134520850) = 4078
exit(5)                                 = ?
+++ exited with 5 +++
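The 4078 figure is just the distance from var2 to the end of its page, and you can compute it at run time as (-address) mod 4096. A small sketch, assuming 4 KiB pages and the question's data layout (the build line is an assumed toolchain); it asks write for exactly the bytes up to the page boundary, all of which are mapped:

; Build (assumed toolchain): nasm -f elf32 page.asm && ld -m elf_i386 -o page page.o
global _start

section .text

_start:
    mov ecx, var2       ; buffer start
    mov edx, ecx
    neg edx
    and edx, 0xFFF      ; edx = (-var2) mod 4096 = bytes from var2 to the end of its page
    mov eax, 0x4        ; write
    mov ebx, 0x1        ; stdout
    int 0x80            ; eax = bytes written, or -errno

    mov eax, 0x1        ; exit
    xor ebx, ebx        ; status 0
    int 0x80

section .data

    var1: db "This is message1", 0x0A, 0x00
    len1 equ $-var1
    var2: db "This is message2", 0x0A, 0x00
    len2 equ $-var2

If .data starts on a page boundary as it does with my linker, strace should show a count of 4078 for this write. With a linker that doesn't page-align .data, the count will differ but is still the distance to the end of the page.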

vs. writing to the terminal, I get a size of 2048

vs. writing to /dev/null, I get success with write returning 134520850. The driver for the null special character device doesn't even read the user-space memory; it just returns success from write system calls that make it that far. So nothing ever touches the buffer, and the bad address is never detected as -EFAULT.

Piping the output to wc, I got a surprising 18-byte partial-write on the first bad call, and -EFAULT on the next.

strace ./2write | wc
execve("./2write", ["./2write"], 0x7ffdba771cf0 /* 53 vars */) = 0
strace: [ Process PID=29008 runs in 32 bit mode. ]
write(1, "This is message1\n\0", 18)    = 18
write(1, "This is message2\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 134520850) = 18
write(1, "This is message1\n\0", 18)    = 18
write(1, "This is message2\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 134520850) = -1 EFAULT (Bad address)
exit(5)                                 = ?
+++ exited with 5 +++
      3       9      54

On subsequent runs of the program, I got -EFAULT right away. I'm guessing that Linux may have allocated more memory for a pipe buffer after the first call, so then it was able to look far enough ahead to notice the bad address right away, before copying any data.

peter@volta:/tmp$ strace ./2write | wc
execve("./2write", ["./2write"], 0x7fff868a41b0 /* 53 vars */) = 0
strace: [ Process PID=29015 runs in 32 bit mode. ]
write(1, "This is message1\n\0", 18)    = 18
write(1, "This is message2\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 134520850) = -1 EFAULT (Bad address)
write(1, "This is message1\n\0", 18)    = 18
write(1, "This is message2\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 134520850) = -1 EFAULT (Bad address)
exit(5)                                 = ?
      2       6      36
