Reputation: 7883
By looking at binfmt_elf.c in the kernel source, I have not been able to figure out what the kernel (64 bit) does differently when spawning a 32 bit process vs a 64 bit process.
Can anybody explain to me what I am missing?
(This question is related to my other question about having 32 bit instructions in the same process as 64 bit instructions (link), but this qualifies as a separate question.)
Upvotes: 2
Views: 1095
Reputation: 7883
If the execveat system call is used to start a new process, we first enter the SYSCALL_DEFINEx(execveat..) function in fs/exec.c in the kernel source. This function then calls these functions:
The search_binary_handler iterates over the various binary handlers. In a 64 bit Linux kernel, there will be one handler for 64 bit ELFs and one for 32 bit ELFs. Both handlers are ultimately built from the same source fs/binfmt_elf.c. However, the 32 bit handler is built via fs/compat_binfmt_elf.c which redefines a number of macros before including the source file binfmt_elf.c itself.
Inside binfmt_elf.c, elf_check_arch is called. This is a macro defined in arch/x86/include/asm/elf.h, and it is defined differently for the 64 bit handler and the 32 bit handler. For 64 bit, it compares with EM_X86_64 (62, defined in include/uapi/linux/elf-em.h). For 32 bit, it compares with EM_386 (3) or EM_486 (6) (defined in the same file). If the comparison fails, the binary handler gives up, so only one of the handlers ends up parsing and executing the ELF, depending on whether the ELF is 64 bit or 32 bit.
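The handler selection described above can be sketched in plain user-space C. This is not the kernel's code: the `sketch_` names are made up for illustration, only the three machine IDs come from the text, and the x32 ABI is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* e_machine values quoted above from include/uapi/linux/elf-em.h.
 * The SKETCH_ prefix marks these as local illustration names. */
enum {
    SKETCH_EM_386    = 3,
    SKETCH_EM_486    = 6,
    SKETCH_EM_X86_64 = 62
};

/* Hypothetical result type: which handler accepted the binary. */
enum sketch_handler {
    SKETCH_HANDLER_NONE,
    SKETCH_HANDLER_NATIVE,  /* built from fs/binfmt_elf.c */
    SKETCH_HANDLER_COMPAT   /* built via fs/compat_binfmt_elf.c */
};

/* Stand-in for elf_check_arch as seen by the native 64 bit handler. */
bool sketch_check_arch_native(uint16_t e_machine)
{
    return e_machine == SKETCH_EM_X86_64;
}

/* Stand-in for elf_check_arch as seen by the compat (32 bit) handler.
 * Simplified assumption: the x32 ABI is ignored here. */
bool sketch_check_arch_compat(uint16_t e_machine)
{
    return e_machine == SKETCH_EM_386 || e_machine == SKETCH_EM_486;
}

/* Mimics the handler loop: each handler is asked in turn and the
 * first one whose architecture check passes takes the binary. */
enum sketch_handler sketch_pick_handler(uint16_t e_machine)
{
    if (sketch_check_arch_native(e_machine))
        return SKETCH_HANDLER_NATIVE;
    if (sketch_check_arch_compat(e_machine))
        return SKETCH_HANDLER_COMPAT;
    return SKETCH_HANDLER_NONE;
}
```

Because the two checks accept disjoint sets of machine IDs, at most one handler can ever claim a given ELF file.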
All differences between parsing 32 bit ELFs and 64 bit ELFs in 64 bit Linux should therefore be found in the file fs/compat_binfmt_elf.c.
The main clue seems to be that start_thread is redefined to compat_start_thread. The definition of compat_start_thread is found in arch/x86/kernel/process_64.c. It calls start_thread_common with these arguments:
start_thread_common(regs, new_ip, new_sp,
                    test_thread_flag(TIF_X32)
                    ? __USER_CS : __USER32_CS,
                    __USER_DS, __USER_DS);
while the normal start_thread function calls start_thread_common with these arguments:
start_thread_common(regs, new_ip, new_sp,
                    __USER_CS, __USER_DS, 0);
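Here we already see the architecture-dependent code setting CS differently for 64 bit ELFs and 32 bit ELFs: the normal path always passes __USER_CS, while the compat path passes __USER32_CS unless TIF_X32 is set. The last argument differs as well (0 vs __USER_DS).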
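The CS choice made by the two call sites can be condensed into one small function. This is a sketch, not kernel code: `sketch_initial_cs` and the `SKETCH_` macros are hypothetical names, and the numeric values assume the 64 bit definitions from arch/x86/include/asm/segment.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Selector values for a 64 bit x86 kernel, assumed from
 * arch/x86/include/asm/segment.h (GDT index * 8 + 3). */
#define SKETCH_USER_CS   (6 * 8 + 3)  /* 0x33 */
#define SKETCH_USER32_CS (4 * 8 + 3)  /* 0x23 */

/* Hypothetical helper: the CS selector a new thread starts with.
 * compat = the ELF was taken by the compat (32 bit) handler,
 * x32    = the TIF_X32 thread flag is set. */
uint16_t sketch_initial_cs(bool compat, bool x32)
{
    if (!compat)
        return SKETCH_USER_CS;  /* path through start_thread */

    /* path through compat_start_thread */
    return x32 ? SKETCH_USER_CS : SKETCH_USER32_CS;
}
```

Only the plain 32 bit case ends up with the 32 bit code segment. An x32 process is handled by the compat path but still starts with the 64 bit CS.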
Then we have the definitions for __USER_CS and __USER32_CS in arch/x86/include/asm/segment.h:
#define __USER_CS (GDT_ENTRY_DEFAULT_USER_CS*8 + 3)
#define __USER32_CS (GDT_ENTRY_DEFAULT_USER32_CS*8 + 3)
and:
#define GDT_ENTRY_DEFAULT_USER_CS 6
#define GDT_ENTRY_DEFAULT_USER32_CS 4
So __USER_CS is 6*8 + 3 = 51 = 0x33, and __USER32_CS is 4*8 + 3 = 35 = 0x23.
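The arithmetic can be checked with a few lines of C. The function name `sketch_user_selector` is made up for this example. It only mirrors the `index*8 + 3` pattern of the two macros.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the macro pattern
 * (GDT_ENTRY_* * 8 + 3) from segment.h. */
uint16_t sketch_user_selector(unsigned gdt_index)
{
    /* Multiplying by 8 moves the index above the three low bits;
     * adding 3 sets the two lowest bits. */
    return (uint16_t)(gdt_index * 8 + 3);
}
```

Calling `sketch_user_selector(6)` gives 0x33 and `sketch_user_selector(4)` gives 0x23, matching the values computed by hand.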
These numbers match what is used for CS in these examples:
Since the CPU is not running in real mode, the segment register is not filled with the segment itself, but a 16-bit selector:
From Wikipedia (Protected mode):
In protected mode, the segment_part is replaced by a 16-bit selector, in which the 13 upper bits (bit 3 to bit 15) contain the index of an entry inside a descriptor table. The next bit (bit 2) specifies whether the operation is used with the GDT or the LDT. The lowest two bits (bit 1 and bit 0) of the selector are combined to define the privilege of the request, where the values of 0 and 3 represent the highest and the lowest privilege, respectively.
With the CS value 0x23, bits 1 and 0 hold the value 3, meaning "lowest privilege". Bit 2 is 0, meaning GDT, and bits 3 to 15 hold the value 4, meaning we get index 4 from the global descriptor table (GDT).
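The same decoding can be written as a small C function that splits a selector into the three fields from the Wikipedia description. The struct and function names are hypothetical and exist only for this sketch.

```c
#include <stdint.h>

/* Fields of a 16-bit protected-mode segment selector. */
struct sketch_selector {
    unsigned index; /* bits 3..15: entry in the descriptor table */
    unsigned ti;    /* bit 2: 0 = GDT, 1 = LDT */
    unsigned rpl;   /* bits 0..1: requested privilege level */
};

/* Hypothetical helper: split a selector into its fields. */
struct sketch_selector sketch_decode_selector(uint16_t sel)
{
    struct sketch_selector s;

    s.index = sel >> 3;         /* drop the three low bits */
    s.ti    = (sel >> 2) & 1u;  /* table indicator */
    s.rpl   = sel & 3u;         /* privilege bits */
    return s;
}
```

For 0x23 this yields index 4, GDT, privilege 3. For 0x33 it yields index 6, GDT, privilege 3, so the two user code selectors differ only in which GDT entry they name.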
This is how far I have been able to dig so far.
Upvotes: 8