Thunderman

Reputation: 115

Who creates the virtual memory in Linux?

I know the kernel takes care of mapping virtual memory to real memory. But I want to know who actually creates the virtual memory layout of a process, as shown in the /proc/pid/maps file.

1) Does the compiler/linker create the virtual memory regions for the process, with the kernel just mapping them to real memory (because the virtual addresses themselves don't matter and all that matters is the mapping)?

2) Or does the kernel itself create the virtual memory space when forking a process and then map it to real memory?

Finally, what does the mmap system call do: (1) or (2)?

Upvotes: 1

Views: 990

Answers (2)

kch

Reputation: 2035

The kernel is the entity that actually creates and manages the virtual memory regions that you see in /proc/pid/maps. Within the structure holding the state of each process (struct task_struct, look in linux/sched.h) there is a pointer to a struct mm_struct, and in particular, within that is struct vm_area_struct * mmap. This is a list, maintained by the kernel, of descriptors for all the memory regions mapped into the process address space. When mmap is invoked, a new element is added to this list and will subsequently show up in /proc/pid/maps.
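A quick way to watch this from user space is to create an anonymous mapping and then look up its address in /proc/self/maps. This is only a minimal sketch, assuming Linux and CPython; the helper names (parse_maps, find_region, map_and_lookup) are made up for illustration. Note that the kernel may merge the new mapping with an adjacent anonymous region that has the same permissions, so the code looks the address up rather than counting lines.

```python
import ctypes
import mmap


def parse_maps(path="/proc/self/maps"):
    """Return (start, end, perms) for every region listed in the maps file."""
    regions = []
    with open(path) as f:
        for line in f:
            fields = line.split()
            start, end = (int(x, 16) for x in fields[0].split("-"))
            regions.append((start, end, fields[1]))
    return regions


def find_region(addr, regions):
    """Return the region containing addr, or None if it is not mapped."""
    for start, end, perms in regions:
        if start <= addr < end:
            return (start, end, perms)
    return None


def map_and_lookup(size=4 * mmap.PAGESIZE):
    # Anonymous private read/write mapping; the kernel chooses the address.
    m = mmap.mmap(-1, size,
                  flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                  prot=mmap.PROT_READ | mmap.PROT_WRITE)
    view = ctypes.c_char.from_buffer(m)   # only used to learn the address
    addr = ctypes.addressof(view)
    region = find_region(addr, parse_maps())
    del view                               # release the buffer before closing
    m.close()
    return addr, region


if __name__ == "__main__":
    addr, region = map_and_lookup()
    print(hex(addr), region)
```

The region that comes back is the kernel's own record of the mapping; nothing in the program itself wrote to /proc/self/maps.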

Note that most of the file-backed regions, e.g. libc.so, listed in /proc/pid/maps are mapped there by the code in the dynamic linker (ld.so) at process startup.

Also note that the kernel will not create virtual-physical mappings for the addresses in these regions until absolutely necessary.
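The lazy behaviour can be observed by comparing the resident set size before and after touching a fresh mapping. The sketch below assumes Linux (it reads the second field of /proc/self/statm, the number of resident pages) and CPython; the 32 MiB size and the function names are arbitrary choices for illustration, and the counters are approximate, so only the overall trend is meaningful.

```python
import mmap


def resident_pages():
    """Resident set size of this process, in pages (field 2 of statm)."""
    with open("/proc/self/statm") as f:
        return int(f.read().split()[1])


def measure_lazy_mapping(size=32 * 1024 * 1024):
    n_pages = size // mmap.PAGESIZE
    before = resident_pages()
    m = mmap.mmap(-1, size,
                  flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                  prot=mmap.PROT_READ | mmap.PROT_WRITE)
    after_map = resident_pages()      # region exists, pages not yet backed
    for off in range(0, size, mmap.PAGESIZE):
        m[off] = 1                    # first write to each page faults it in
    after_touch = resident_pages()
    m.close()
    return n_pages, before, after_map, after_touch


if __name__ == "__main__":
    n_pages, before, after_map, after_touch = measure_lazy_mapping()
    print("pages mapped:", n_pages)
    print("resident growth after mmap: ", after_map - before)
    print("resident growth after touch:", after_touch - after_map)
```

The region shows up in /proc/self/maps as soon as mmap returns, but the resident count should only grow noticeably once the pages are actually written.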

Hope this helps

Upvotes: 1

Gnurou

Reputation: 8153

Both your assertions are actually correct (to some extent).

In the case of an executable ELF file, the linker relies on a linker script to assign an address in the virtual space to every symbol of the program (these are grouped into sections which all have a start address and size). You can see the default script that is used by invoking ld --verbose. The sections of the binary and their addresses can be seen using tools like readelf or objdump, e.g. readelf -l /bin/cat. Then if you run cat /proc/self/maps you should see that the addresses at which /bin/cat is mapped do match (for a position-independent executable, the default on many current distributions, the addresses in the file are instead offsets from a load base chosen at run time). So that is what the execve system call does: it replaces the current process' address space with a new one into which the executable file given as argument is mapped.

Of course, if every bit of code were assigned a static address, you would run into problems with shared libraries. Shared libraries use position-independent code, so they can be mapped almost anywhere in the process address space. Here the kernel decides where to place them.

mmap does neither (1) nor (2); it just maps anonymous memory or part of a file at a given address in the address space (or lets the kernel decide which address to use). In fact, it is what is used to map the shared libraries that a program uses. To see how, run strace /bin/true and see how execve is called first to create the process' address space from the binary file, and how the libc file is then opened and the relevant sections mmap'ed with the right permissions by the program loader:
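If you would rather see those link-time addresses without readelf, the loadable segments can be read straight out of the ELF program headers. This is a rough sketch, not a full ELF parser: it assumes a 64-bit little-endian ELF file and only reports PT_LOAD entries; the function name load_segments is invented for the example.

```python
import struct

PT_LOAD = 1  # program header type for loadable segments


def load_segments(path):
    """Return (vaddr, memsz, filesz, offset, flags) for each PT_LOAD segment."""
    with open(path, "rb") as f:
        ident = f.read(16)
        # Assumption: ELFCLASS64 (2) and little-endian data encoding (1).
        if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
            raise ValueError("not a 64-bit little-endian ELF file")
        f.seek(0)
        header = struct.unpack("<16sHHIQQQIHHHHHH", f.read(64))
        e_phoff, e_phentsize, e_phnum = header[5], header[9], header[10]
        segments = []
        for i in range(e_phnum):
            f.seek(e_phoff + i * e_phentsize)
            (p_type, p_flags, p_offset, p_vaddr,
             _p_paddr, p_filesz, p_memsz, _p_align) = struct.unpack(
                "<IIQQQQQQ", f.read(56))
            if p_type == PT_LOAD:
                segments.append((p_vaddr, p_memsz, p_filesz, p_offset, p_flags))
    return segments


if __name__ == "__main__":
    # /proc/self/exe is the binary of the running interpreter.
    for vaddr, memsz, filesz, offset, flags in load_segments("/proc/self/exe"):
        print(f"vaddr={vaddr:#x} memsz={memsz:#x} "
              f"filesz={filesz:#x} offset={offset:#x} flags={flags:#x}")
```

Comparing the printed vaddr values with the lines for the same binary in /proc/self/maps shows how much of the layout was fixed by the linker and how much was decided when the file was loaded.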

execve("/bin/true", ["/bin/true"], [/* 69 vars */]) = 0
...
open("/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
mmap(NULL, 3804080, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f224f351000
mprotect(0x7f224f4e8000, 2097152, PROT_NONE) = 0
mmap(0x7f224f6e8000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x197000) = 0x7f224f6e8000
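The same kind of file-backed mapping can be reproduced from an ordinary program. The sketch below, assuming Linux and CPython, maps a temporary file privately and returns the /proc/self/maps line that covers it; the function name and the use of a temporary file are illustrative choices, not what the program loader itself does.

```python
import ctypes
import mmap
import os
import tempfile


def map_file_and_find_line(size=mmap.PAGESIZE):
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"\0" * size)
        tmp.flush()
        # Private (copy-on-write) mapping; the kernel picks the address.
        m = mmap.mmap(tmp.fileno(), size,
                      flags=mmap.MAP_PRIVATE,
                      prot=mmap.PROT_READ | mmap.PROT_WRITE)
        view = ctypes.c_char.from_buffer(m)   # only used to learn the address
        addr = ctypes.addressof(view)
        found = None
        with open("/proc/self/maps") as maps:
            for line in maps:
                start, end = (int(x, 16) for x in line.split()[0].split("-"))
                if start <= addr < end:
                    found = line.rstrip("\n")
                    break
        del view                               # release the buffer before closing
        m.close()
        return os.path.basename(tmp.name), found


if __name__ == "__main__":
    name, line = map_file_and_find_line()
    print(line)
```

The printed line carries the file's path in its last column, just like the libc.so entries that the loader's mmap calls produce.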

Upvotes: 3
