Reputation: 115
I know the kernel takes care of mapping virtual memory to real memory. But I want to know who actually creates the virtual memory regions for a process, as shown in the /proc/pid/maps file.
1) Does the compiler/linker create the virtual memory regions for the process, with the kernel just mapping them to real memory (because the virtual memory region itself doesn't matter and all that matters is the mapping)?
2) Or does the kernel itself create the virtual memory space while forking a process and then map it to real memory?
Finally, which of these does the mmap system call do, (1) or (2)?
Upvotes: 1
Views: 990
Reputation: 2035
The kernel is the entity that actually creates and manages the virtual memory regions that you see in /proc/pid/maps. Within the structure holding the state of each process (struct task_struct) there is a struct mm_struct (look in linux/sched.h, or linux/mm_types.h in more recent kernels), and in particular, within that is struct vm_area_struct * mmap. This is a list maintained by the kernel of all the memory regions (each described by a region descriptor) mapped into the process address space. Recent kernels (6.1 and later) keep these descriptors in a tree rather than a plain list, but the idea is the same. When mmap is invoked, a new element is added to this collection, and it will subsequently show up in /proc/pid/maps.
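To see this from user space, here is a minimal sketch (Linux only) that creates an anonymous mapping and then looks for it in /proc/self/maps. The helper names read_maps, region_containing and map_and_lookup are made up for this illustration, and ctypes is used only to find out which address the kernel picked. The lookup checks for a region that contains the address rather than one that starts at it, since the kernel may merge adjacent regions.

```python
import ctypes
import mmap


def read_maps():
    """Parse /proc/self/maps into (start, end, perms) tuples."""
    regions = []
    with open("/proc/self/maps") as f:
        for line in f:
            fields = line.split(maxsplit=5)
            start, end = (int(x, 16) for x in fields[0].split("-"))
            regions.append((start, end, fields[1]))
    return regions


def region_containing(addr):
    """Return the region that covers addr, or None if there is none."""
    for start, end, perms in read_maps():
        if start <= addr < end:
            return (start, end, perms)
    return None


def map_and_lookup(length=1 << 20):
    """Create an anonymous mapping and find it in /proc/self/maps."""
    m = mmap.mmap(-1, length)  # fileno -1: anonymous, kernel picks the address
    view = ctypes.c_char.from_buffer(m)
    addr = ctypes.addressof(view)
    del view  # drop the buffer export so the mapping can be closed
    region = region_containing(addr)
    m.close()
    return addr, length, region


if __name__ == "__main__":
    addr, length, region = map_and_lookup()
    print(hex(addr), length, region)
```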
Note that most of the file-backed regions, e.g. libc.so, listed in /proc/pid/maps are mapped there by the code in the dynamic linker (ld.so) at process startup.
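Also note that the kernel does not create virtual-to-physical mappings for the addresses in these regions until they are actually needed: a page is typically mapped in only when the process first touches it and takes a page fault (demand paging).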
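One way to observe this lazy behaviour is to watch the resident set size while mapping and then touching memory. The following is only a sketch under a few assumptions: it runs on Linux, it reads the resident page count from the second field of /proc/self/statm (which can be slightly approximate), and the function names are invented for the example.

```python
import mmap
import os

PAGE = os.sysconf("SC_PAGE_SIZE")


def resident_pages():
    """Resident set size of this process in pages (2nd field of statm)."""
    with open("/proc/self/statm") as f:
        return int(f.read().split()[1])


def demand_paging_demo(size=32 * 1024 * 1024):
    """Map `size` bytes, then touch every page, sampling RSS along the way."""
    before = resident_pages()
    m = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    mapped = resident_pages()   # region exists, but no pages are backed yet
    for off in range(0, size, PAGE):
        m[off] = 1              # first write to each page triggers a fault
    touched = resident_pages()  # now the pages are resident
    m.close()
    return before, mapped, touched, size // PAGE


if __name__ == "__main__":
    before, mapped, touched, n = demand_paging_demo()
    print("pages in region:", n)
    print("RSS growth after mmap: ", mapped - before)
    print("RSS growth after touch:", touched - mapped)
```

The growth after mmap should stay close to zero, while the growth after touching should be roughly the number of pages in the region.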
Hope this helps
Upvotes: 1
Reputation: 8153
Both your assertions are actually correct (to some extent).
In the case of an executable ELF file, the linker relies on a linker script to assign an address in the virtual address space to every symbol of the program (symbols are grouped into sections, each of which has a start address and a size). You can see the default script by invoking ld --verbose. The sections and segments of the binary and their addresses can be inspected with tools like readelf or objdump, e.g. readelf -l /bin/cat. Then, if you run cat /proc/self/maps, you should see that the addresses at which /bin/cat is mapped match. (If the binary was built as a position-independent executable, which many distributions now do by default, the kernel picks the load base, so only the offsets within the image match.) So this is what the execve system call does: it replaces the current process's address space with a new one into which the executable file given as argument is mapped.
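For comparing against the readelf output, it can help to pull out just the lines of /proc/self/maps that belong to the running executable. A rough sketch follows, assuming Linux and assuming that the path reported by /proc/self/exe is spelled the same way as in the maps file; parse_maps and executable_mappings are illustrative names, not an existing API.

```python
import os


def parse_maps(pid="self"):
    """Parse /proc/<pid>/maps into a list of dicts, one per region."""
    entries = []
    with open(f"/proc/{pid}/maps") as f:
        for line in f:
            fields = line.split(maxsplit=5)
            start, end = (int(x, 16) for x in fields[0].split("-"))
            entries.append({
                "start": start,
                "end": end,
                "perms": fields[1],
                "offset": int(fields[2], 16),
                # the pathname is optional; anonymous regions have none
                "path": fields[5].strip() if len(fields) > 5 else "",
            })
    return entries


def executable_mappings():
    """Return the executable's path and the regions mapped from it."""
    exe = os.readlink("/proc/self/exe")
    return exe, [e for e in parse_maps() if e["path"] == exe]


if __name__ == "__main__":
    exe, regions = executable_mappings()
    print(exe)
    for e in regions:
        print(f'{e["start"]:x}-{e["end"]:x} {e["perms"]} offset={e["offset"]:#x}')
```

Run under the Python interpreter, this lists the interpreter binary's own segments; the permissions and offsets can be lined up with the LOAD entries that readelf -l prints for the same file.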
Of course, if every bit of code were assigned a static address, you would run into problems with shared libraries. Shared libraries use position-independent code, so they can be mapped almost anywhere in the process address space. Here the kernel decides where they end up.
mmap does neither (1) nor (2); it just maps memory or part of a file at a given address in the address space (or lets the kernel decide which address to use). It is in fact what is used to map the shared libraries that a program uses. To see how, run strace /bin/true and see how execve is called first to create the process's address space from the binary file, and how the libc file is then opened and its relevant sections are mmap'ed with the right permissions by the program loader:
execve("/bin/true", ["/bin/true"], [/* 69 vars */]) = 0
...
open("/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
mmap(NULL, 3804080, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f224f351000
mprotect(0x7f224f4e8000, 2097152, PROT_NONE) = 0
mmap(0x7f224f6e8000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x197000) = 0x7f224f6e8000
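The file-backed mmap calls in the trace above can be imitated on a small scale. The sketch below is not what the loader itself does; it simply maps the second chunk of a temporary file read-only and private at a kernel-chosen address, then finds the resulting line in /proc/self/maps. It assumes Linux, and map_second_chunk is a name made up for the example.

```python
import mmap
import os
import tempfile


def map_second_chunk():
    """Map the second allocation-granularity chunk of a temp file."""
    gran = mmap.ALLOCATIONGRANULARITY  # offsets must be a multiple of this
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"A" * gran + b"B" * gran)
        tmp.flush()
        name = os.path.basename(tmp.name)
        m = mmap.mmap(tmp.fileno(), gran, flags=mmap.MAP_PRIVATE,
                      prot=mmap.PROT_READ, offset=gran)
        data = m[:4]  # comes from the second chunk of the file
        fields = None
        with open("/proc/self/maps") as f:
            for line in f:
                if line.strip().endswith(name):
                    # address, perms, offset, dev, inode, pathname
                    fields = line.split(maxsplit=5)
        m.close()
    return gran, data, fields


if __name__ == "__main__":
    gran, data, fields = map_second_chunk()
    print(data)
    print(fields)
```

The offset column of the matching line should equal the file offset passed to mmap, in the same way that the 0x197000 offset in the trace selects which part of libc is mapped.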
Upvotes: 3