Reputation: 3235
I have the following code to check a process's segment boundaries:
#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <stdbool.h>
#include <fcntl.h>
extern char edata, etext, end;
void dump(void)
{
    printf("etext=%p, edata=%p, end=%p, sbrk(0)=%p\n", &etext, &edata, &end, sbrk(0));
}
int main(void)
{
    dump();
    char *p = (char*)malloc(1);
    printf("allocated 1 byte\n");
    dump();
    free(p);
    p = (char*)malloc(1024);
    printf("allocated 1024 bytes\n");
    dump();
    free(p);
    return 0;
}
Here is the output:
>./a.out
etext=0x557f00098305, edata=0x557f0009b010, end=0x557f0009b018, sbrk(0)=0x557f00b0f000
allocated 1 byte
etext=0x557f00098305, edata=0x557f0009b010, end=0x557f0009b018, sbrk(0)=0x557f00b30000
allocated 1024 bytes
etext=0x557f00098305, edata=0x557f0009b010, end=0x557f0009b018, sbrk(0)=0x557f00b30000
I am perplexed by the output: whether I allocate 1 byte or 1024 bytes, sbrk(0) returns the same value, 0x557f00b30000. I can interpret this as memory being allocated by page rather than by exact request, which is fine, but why this size? The difference between 0x557f00b0f000 and 0x557f00b30000 is 135,168 bytes, which is 132 KiB. This value looks a bit odd to me. Is this the page size? How can I find it out?
Upvotes: 3
Views: 642
Reputation: 5201
The dynamic memory allocator of the C library manages the so-called heap, but this does not mean it works with the heap only. Moreover, service functions like printf() use the dynamic memory allocator as well, so there are hidden calls to malloc().
For performance reasons, especially in multithreaded environments, a malloc() call in the GNU C library (GLIBC) can place an allocation not only in the heap, grown through sbrk(), but also in memory obtained through mmap() (for example, the additional per-thread arenas). Moreover, if the allocation is huge, it can get a dedicated memory mapping of its own. Internally, some complex heuristics decide where the allocated blocks go, in order to meet the performance requirements and avoid memory fragmentation.
In other words, the heap does not grow in step with each user request. The allocator (pre)allocates memory internally in the heap and in memory mapped regions to satisfy user requests as efficiently as possible, minimizing time-consuming system calls like brk() or mmap() as well as mutex contention.
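One way to observe this preallocation from inside a program is to sample the program break right before and right after a single malloc() call. The sketch below does that. break_delta() is a hypothetical helper name, not a library function, and the values it returns depend on the C library and its tuning.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for sbrk() under strict -std= modes */
#endif
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Return the number of bytes by which the program break moved while
   malloc(size) was running. 0 means the request was served from memory
   the allocator already owned (or from outside the brk heap). */
long break_delta(size_t size)
{
    intptr_t before = (intptr_t)sbrk(0);
    void *p = malloc(size);
    intptr_t after = (intptr_t)sbrk(0);

    free(p);
    return (long)(after - before);
}
```

Calling break_delta(1) and then break_delta(1024) at the very start of main(), before any printf(), is a way to check which request actually moves the break on a given system.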
Let's slightly modify your program: add a big allocation request (without freeing it) and a pause before exiting, so that we can look at the memory map of the process. The printf() calls now also display the returned address (%p in the format string).
#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <stdbool.h>
#include <fcntl.h>
extern char edata, etext, end;
void dump(void)
{
    printf("etext=%p, edata=%p, end=%p, sbrk(0)=%p\n", &etext, &edata, &end, sbrk(0));
}
int main(void)
{
    dump();
    char *p = (char*)malloc(1);
    printf("allocated 1 byte @ %p\n", p);
    dump();
    free(p);
    p = (char*)malloc(1024);
    printf("allocated 1024 bytes @ %p\n", p);
    dump();
    free(p);
    /* Big request: 1 MiB, intentionally not freed */
    p = (char *)malloc(1024*1024);
    printf("allocated 1024 bytes @ %p\n", p); /* really 1 MiB; label left as-is to match the recorded output */
    dump();
    pause(); /* keep the process alive so its memory map can be inspected */
    return 0;
}
Compilation and execution:
$ ./a.out
etext=0x5609281a58dd, edata=0x5609283a6010, end=0x5609283a6018, sbrk(0)=0x560929d9b000
allocated 1 byte @ 0x560929d9b670
etext=0x5609281a58dd, edata=0x5609283a6010, end=0x5609283a6018, sbrk(0)=0x560929dbc000
allocated 1024 bytes @ 0x560929d9b690
etext=0x5609281a58dd, edata=0x5609283a6010, end=0x5609283a6018, sbrk(0)=0x560929dbc000
allocated 1024 bytes @ 0x7f28f645b010
etext=0x5609281a58dd, edata=0x5609283a6010, end=0x5609283a6018, sbrk(0)=0x560929dbc000
In another terminal, display the memory map of the running process, which is suspended in the pause() call:
$ cat /proc/`pidof a.out`/smaps
[...]
560929d9b000-560929dbc000 rw-p 00000000 00:00 0 [heap]
Size: 132 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 4 kB
Pss: 4 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
Referenced: 4 kB
Anonymous: 4 kB
LazyFree: 0 kB
AnonHugePages: 0 kB
ShmemPmdMapped: 0 kB
FilePmdMapped: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
Locked: 0 kB
THPeligible: 0
VmFlags: rd wr mr mw me ac sd
[...]
7f28f645b000-7f28f655e000 rw-p 00000000 00:00 0
Size: 1036 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 12 kB
Pss: 12 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 12 kB
Referenced: 12 kB
Anonymous: 12 kB
LazyFree: 0 kB
AnonHugePages: 0 kB
ShmemPmdMapped: 0 kB
FilePmdMapped: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 0 kB
SwapPss: 0 kB
Locked: 0 kB
THPeligible: 0
VmFlags: rd wr mr mw me ac sd
[...]
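The KernelPageSize field in this listing reports 4 kB pages, so the 132 kB [heap] mapping covers 33 pages rather than one. The page size can also be queried at run time with sysconf(_SC_PAGESIZE). A minimal sketch follows; page_size() and pages_for() are helper names made up for this illustration.

```c
#include <unistd.h>

/* Return the system page size in bytes, or -1 if it cannot be determined. */
long page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}

/* Return the number of whole pages needed to cover `bytes`,
   or -1 if the page size is unavailable. */
long pages_for(long bytes)
{
    long page = page_size();

    if (page <= 0)
        return -1;
    return (bytes + page - 1) / page;
}
```

On a system with 4096-byte pages, such as the one shown above, pages_for(135168) gives 33.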
The first two malloc() calls return allocated blocks at 0x560929d9b670 and 0x560929d9b690, in the heap area beginning at address 0x560929d9b000, but the third malloc() call allocated the memory in a separate memory area, at 0x7f28f645b010.
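A rough way to tell the two cases apart in code is to test whether a pointer falls between the end of the program's data (the linker symbol end, already used in dump()) and the current program break. This is only a sketch: in_brk_heap() is a hypothetical helper, it relies on the end symbol provided by the GNU toolchain on Linux, and it says nothing about how other allocators lay out memory.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for sbrk() under strict -std= modes */
#endif
#include <stdint.h>
#include <unistd.h>

extern char end;  /* first address past the uninitialized data (bss) */

/* Return 1 if p lies in [&end, sbrk(0)), i.e. in the region grown with
   brk()/sbrk(), and 0 otherwise (for example for a block obtained through
   mmap()). This is an approximation: the range also includes any gap
   between the end of bss and the real start of the heap. */
int in_brk_heap(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t low  = (uintptr_t)&end;
    uintptr_t high = (uintptr_t)sbrk(0);

    return addr >= low && addr < high;
}
```

With the run shown above, in_brk_heap() would return 1 for the first two pointers and 0 for the third.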
Let's run the program under strace to see the system calls involved (this is a separate run, so the addresses differ from those above):
$ strace ./a.out
execve("./a.out", ["./a.out"], 0x7ffdfc67d420 /* 47 vars */) = 0
brk(NULL) = 0x56271fac4000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=116079, ...}) = 0
mmap(NULL, 116079, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fb47b852000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240\35\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=2030928, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb47b850000
mmap(NULL, 4131552, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fb47b255000
mprotect(0x7fb47b43c000, 2097152, PROT_NONE) = 0
mmap(0x7fb47b63c000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e7000) = 0x7fb47b63c000
mmap(0x7fb47b642000, 15072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fb47b642000
close(3) = 0
arch_prctl(ARCH_SET_FS, 0x7fb47b8514c0) = 0
mprotect(0x7fb47b63c000, 16384, PROT_READ) = 0
mprotect(0x56271dfab000, 4096, PROT_READ) = 0
mprotect(0x7fb47b86f000, 4096, PROT_READ) = 0
munmap(0x7fb47b852000, 116079) = 0
brk(NULL) = 0x56271fac4000
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
brk(0x56271fae5000) = 0x56271fae5000
write(1, "etext=0x56271ddab8dd, edata=0x56"..., 87etext=0x56271ddab8dd, edata=0x56271dfac010, end=0x56271dfac018, sbrk(0)=0x56271fac4000
) = 87
write(1, "allocated 1 byte @ 0x56271fac467"..., 34allocated 1 byte @ 0x56271fac4670
) = 34
write(1, "etext=0x56271ddab8dd, edata=0x56"..., 87etext=0x56271ddab8dd, edata=0x56271dfac010, end=0x56271dfac018, sbrk(0)=0x56271fae5000
) = 87
write(1, "allocated 1024 bytes @ 0x56271fa"..., 38allocated 1024 bytes @ 0x56271fac4690
) = 38
write(1, "etext=0x56271ddab8dd, edata=0x56"..., 87etext=0x56271ddab8dd, edata=0x56271dfac010, end=0x56271dfac018, sbrk(0)=0x56271fae5000
) = 87
mmap(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb47b74f000
write(1, "allocated 1024 bytes @ 0x7fb47b7"..., 38allocated 1024 bytes @ 0x7fb47b74f010
) = 38
write(1, "etext=0x56271ddab8dd, edata=0x56"..., 87etext=0x56271ddab8dd, edata=0x56271dfac010, end=0x56271dfac018, sbrk(0)=0x56271fae5000
) = 87
pause(
The first call to sbrk(0) in the program (in the first call to dump()) is translated into a brk(NULL) system call by the C library:
brk(NULL) = 0x56271fac4000
At this point, the end of the heap is at 0x56271fac4000.
Before the output of the first printf() is written, we see another call, brk(0x56271fae5000). It corresponds to an internal malloc() done inside printf() to allocate the output buffer of the stdout stream (the fstat(1, ...) call just before it is stdio inspecting the descriptor to choose the buffering). This makes the heap grow by 135168 (i.e. 0x21000) bytes:
brk(0x56271fae5000) = 0x56271fae5000
So, the first printf() shows the result of sbrk(0), which is 0x56271fac4000, but the execution of printf() itself made the heap grow up to 0x56271fae5000:
write(1, "etext=0x56271ddab8dd, edata=0x56"..., 87etext=0x56271ddab8dd, edata=0x56271dfac010, end=0x56271dfac018, sbrk(0)=0x56271fac4000
) = 87
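To read the break without this side effect, the value can be formatted into a stack buffer and written with write(), bypassing the stdout stream and its buffer. The following is a sketch under the assumption that snprintf() does not itself allocate for such a simple format, which is the usual behavior but not a documented guarantee. format_break() and dump_break_raw() are illustrative names.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for sbrk() under strict -std= modes */
#endif
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Format the current program break into buf.
   Return the number of characters written, or -1 if buf is too small. */
int format_break(char *buf, size_t size)
{
    int len = snprintf(buf, size, "sbrk(0)=%p\n", sbrk(0));

    if (len < 0 || (size_t)len >= size)
        return -1;
    return len;
}

/* Print the program break with write(), without using the stdout stream.
   Return the result of write(), or -1 if formatting failed. */
ssize_t dump_break_raw(void)
{
    char line[64];
    int len = format_break(line, sizeof line);

    if (len < 0)
        return -1;
    return write(STDOUT_FILENO, line, (size_t)len);
}
```

Calling dump_break_raw() before and after the first malloc(1) is a way to check whether that allocation, rather than printf(), is what moves the break.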
Then, the program calls malloc(1), which does not trigger any brk() system call because the internal brk() triggered by the preceding printf() already preallocated memory space. Hence, the result of malloc(1) is 0x56271fac4670 and the following call to dump() shows the heap top at the address discussed just before (0x56271fae5000):
write(1, "etext=0x56271ddab8dd, edata=0x56"..., 87etext=0x56271ddab8dd, edata=0x56271dfac010, end=0x56271dfac018, sbrk(0)=0x56271fae5000
) = 87
And it is the same for the following malloc(1024):
write(1, "etext=0x56271ddab8dd, edata=0x56"..., 87etext=0x56271ddab8dd, edata=0x56271dfac010, end=0x56271dfac018, sbrk(0)=0x56271fae5000
) = 87
The third call, malloc(1024*1024), is another story. The dynamic memory allocator decides that such a big allocation will not go in the heap, as it may contribute to fragmenting it. So, an independent call to mmap() is made to satisfy the request:
mmap(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb47b74f000
write(1, "allocated 1024 bytes @ 0x7fb47b7"..., 38allocated 1024 bytes @ 0x7fb47b74f010
) = 38
Hence, the returned address, 0x7fb47b74f010, is outside of the heap. Consequently, the heap is not impacted at all by this last call. Its top address stays the same:
write(1, "etext=0x56271ddab8dd, edata=0x56"..., 87etext=0x56271ddab8dd, edata=0x56271dfac010, end=0x56271dfac018, sbrk(0)=0x56271fae5000
) = 87
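The length passed to mmap() in the trace, 1052672, can be reproduced with simple arithmetic: the 1048576 requested bytes plus a small chunk header (16 bytes here, consistent with the returned pointer ending in 0x010), rounded up to a whole number of 4096-byte pages. The helper below is a simplified model written for this illustration, not glibc's actual code, and mmap_length() is a made-up name.

```c
#include <stddef.h>

/* Simplified model of the size of a dedicated mapping:
   (request + header) rounded up to a multiple of the page size.
   `page` must be non-zero. */
size_t mmap_length(size_t request, size_t header, size_t page)
{
    size_t needed = request + header;

    return (needed + page - 1) / page * page;
}
```

mmap_length(1024 * 1024, 16, 4096) returns 1052672, the value seen in the mmap() call above.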
Upvotes: 3
Reputation: 5790
As previously stated, it depends on the implementation, but the glibc manual pages give some clues:
malloc (Notes section):
Normally, malloc() allocates memory from the heap, and adjusts the size of the heap as required, using sbrk.
mallopt - set memory allocation parameters:
M_TOP_PAD - This parameter defines the amount of padding to employ when calling sbrk to modify the program break. (The measurement unit for this parameter is bytes.) This parameter has an effect in the following circumstances:
- When the program break is increased, then M_TOP_PAD bytes are added to the sbrk request.
- The amount of padding is always rounded to a system page boundary
- The default value for this parameter is 128*1024
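Applied to the question, this accounts for the observed 135168 bytes: a small request plus the default 128*1024 bytes of padding, rounded up to a page boundary, gives 33 pages of 4096 bytes. The function below is a simplified model of that computation, assuming 4096-byte pages (the real allocator also adds some bookkeeping overhead, which does not change the result here). brk_growth() is a made-up name for this sketch.

```c
#include <stddef.h>

/* Simplified model of how far the program break is moved:
   (request + top_pad) rounded up to a multiple of the page size.
   `page` must be non-zero. */
size_t brk_growth(size_t request, size_t top_pad, size_t page)
{
    size_t needed = request + top_pad;

    return (needed + page - 1) / page * page;
}
```

Both brk_growth(1, 128 * 1024, 4096) and brk_growth(1024, 128 * 1024, 4096) return 135168, i.e. 132 KiB. Setting the padding to 0 with mallopt(M_TOP_PAD, 0) before the first allocation would be one way to check this on a given system.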
Upvotes: 1