Reputation: 14485
I want to create a program that will simulate an out-of-memory (OOM) situation on a Unix server. I created this super-simple memory eater:
#include <stdio.h>
#include <stdlib.h>
unsigned long long memory_to_eat = 1024 * 50000;
size_t eaten_memory = 0;
void *memory = NULL;
int eat_kilobyte()
{
memory = realloc(memory, (eaten_memory * 1024) + 1024);
if (memory == NULL)
{
// realloc failed here - we probably can't allocate more memory for whatever reason
return 1;
}
else
{
eaten_memory++;
return 0;
}
}
int main(int argc, char **argv)
{
printf("I will try to eat %i kb of ram\n", memory_to_eat);
int megabyte = 0;
while (memory_to_eat > 0)
{
memory_to_eat--;
if (eat_kilobyte())
{
printf("Failed to allocate more memory! Stucked at %i kb :(\n", eaten_memory);
return 200;
}
if (megabyte++ >= 1024)
{
printf("Eaten 1 MB of ram\n");
megabyte = 0;
}
}
printf("Successfully eaten requested memory!\n");
free(memory);
return 0;
}
It eats as much memory as defined in memory_to_eat
which now is exactly 50 GB of RAM. It allocates memory by 1 MB and prints exactly the point where it fails to allocate more, so that I know which maximum value it managed to eat.
The problem is that it works. Even on a system with 1 GB of physical memory.
When I check top I see that the process eats 50 GB of virtual memory and only less than 1 MB of resident memory. Is there a way to create a memory eater that really does consume it?
System specifications: Linux kernel 3.16 (Debian) most likely with overcommit enabled (not sure how to check it out) with no swap and virtualized.
Upvotes: 152
Views: 11348
Reputation: 20731
As mentioned by others, the allocation of memory, until used, does not always commit the necessary RAM. This happens if you allocate a buffer larger than one page (usually 4Kb on Linux).
One simple answer would be for your "eat memory" function to always allocate 1Kb instead of increasingly larger blocks. This is because each allocated blocks start with a header (a size for allocated blocks). So allocating a buffer of a size equal to or less than one page will always commit all of those pages.
To optimize your code as much as possible, you want to allocate blocks of memory aligned to 1 page size.
From what I can see in your code, you use 1024. I would suggest that you use:
int size;
size = getpagesize();
block_size = size - sizeof(void *) * 2;
What voodoo magic is this sizeof(void *) * 2
?! When using the default memory allocation library (i.e. not SAN, fence, valgrin, ...), there is a small header just before the pointer returned by malloc()
which includes a pointer to the next block and a size.
struct mem_header { void * next_block; intptr_t size; };
Now, using block_size
, all your malloc()
should be aligned to the page size we found earlier.
If you want to properly align everything, the first allocation needs to use an aligned allocation:
char *p = NULL;
int posix_memalign(&p, size, block_size);
Further allocations (assuming your tool only does that) can use malloc()
. They will be aligned.
p = malloc(block_size);
Note: please verify that it is indeed aligned on your system... it works on mine.
As a result you can simplify your loop with:
for(;;)
{
p = malloc(block_size);
*p = 1;
}
Until you create a thread, the malloc()
does not use mutexes. But it still has to look for a free memory block. In your case, though, it will be one after the other and there will be no holes in the allocated memory so it will be pretty fast.
Further note about how memory is generally allocated in a Unix system:
the malloc()
function and related functions will allocate a block in your heap; which at the start is pretty small (maybe 2Mb)
when the existing heap is full it gets grown using the sbrk()
function; as far as your process is concerned, the memory address always increases, that's what sbrk()
does (contrary to MS-Windows which allocates blocks all over the place)
using sbrk()
once and then hitting the memory every "page size" bytes would be faster than using malloc()
char * p = malloc(size); // get current "highest address"
p += size;
p = (char*)((intptr_t)p & -size); // clear bits (alignment)
int total_mem(50 * 1024 * 1024 * 1024); // 50Gb
void * start(sbrk(total_mem));
char * end((char *)start + total_mem);
for(; p < end; p += size)
{
*p = 1;
}
note that the malloc()
above may give you the "wrong" start address. But your process really doesn't do much, so I think you'll always be safe. That for()
loop, however, is going to be as fast as possible. As mentioned by others, you'll get the total_mem
of virtual memory allocated "instantly" and then the RSS memory allocated each time you write to *p
.
WARNING: Code not tested, use at your own risk.
Upvotes: 0
Reputation: 28892
Not sure about this one but the only explanation that I can things of is that linux is a copy-on-write operating system. When one calls fork
the both processes point to the same physically memory. The memory is only copied once one process actually WRITES to the memory.
I think here, the actual physical memory is only allocated when one tries to write something to it. Calling sbrk
or mmap
may well only update the kernel's memory book-keep. The actual RAM may only be allocated when we actually try to access the memory.
Upvotes: 6
Reputation: 7352
All the virtual pages start out copy-on-write mapped to the same zeroed physical page. To use up physical pages, you can dirty them by writing something to each virtual page.
If running as root, you can use mlock(2)
or mlockall(2)
to have the kernel wire up the pages when they're allocated, without having to dirty them. (normal non-root users have a ulimit -l
of only 64kiB.)
As many others suggested, it seems that the Linux kernel doesn't really allocate the memory unless you write to it
This also fixes the printf format string mismatches with the types of memory_to_eat and eaten_memory, using %zi
to print size_t
integers. The memory size to eat, in kiB, can optionally be specified as a command line arg.
The messy design using global variables, and growing by 1k instead of 4k pages, is unchanged.
#include <stdio.h>
#include <stdlib.h>
size_t memory_to_eat = 1024 * 50000;
size_t eaten_memory = 0;
char *memory = NULL;
void write_kilobyte(char *pointer, size_t offset)
{
int size = 0;
while (size < 1024)
{ // writing one byte per page is enough, this is overkill
pointer[offset + (size_t) size++] = 1;
}
}
int eat_kilobyte()
{
if (memory == NULL)
{
memory = malloc(1024);
} else
{
memory = realloc(memory, (eaten_memory * 1024) + 1024);
}
if (memory == NULL)
{
return 1;
}
else
{
write_kilobyte(memory, eaten_memory * 1024);
eaten_memory++;
return 0;
}
}
int main(int argc, char **argv)
{
if (argc >= 2)
memory_to_eat = atoll(argv[1]);
printf("I will try to eat %zi kb of ram\n", memory_to_eat);
int megabyte = 0;
int megabytes = 0;
while (memory_to_eat-- > 0)
{
if (eat_kilobyte())
{
printf("Failed to allocate more memory at %zi kb :(\n", eaten_memory);
return 200;
}
if (megabyte++ >= 1024)
{
megabytes++;
printf("Eaten %i MB of ram\n", megabytes);
megabyte = 0;
}
}
printf("Successfully eaten requested memory!\n");
free(memory);
return 0;
}
Upvotes: 29
Reputation: 40625
When your malloc()
implementation requests memory from the system kernel (via an sbrk()
or mmap()
system call), the kernel only makes a note that you have requested the memory and where it is to be placed within your address space. It does not actually map those pages yet.
When the process subsequently accesses memory within the new region, the hardware recognizes a segmentation fault and alerts the kernel to the condition. The kernel then looks up the page in its own data structures, and finds that you should have a zero page there, so it maps in a zero page (possibly first evicting a page from page-cache) and returns from the interrupt. Your process does not realize that any of this happened, the kernels operation is perfectly transparent (except for the short delay while the kernel does its work).
This optimization allows the system call to return very quickly, and, most importantly, it avoids any resources to be committed to your process when the mapping is made. This allows processes to reserve rather large buffers that they never need under normal circumstances, without fear of gobbling up too much memory.
So, if you want to program a memory eater, you absolutely have to actually do something with the memory you allocate. For this, you only need to add a single line to your code:
int eat_kilobyte()
{
if (memory == NULL)
memory = malloc(1024);
else
memory = realloc(memory, (eaten_memory * 1024) + 1024);
if (memory == NULL)
{
return 1;
}
else
{
//Force the kernel to map the containing memory page.
((char*)memory)[1024*eaten_memory] = 42;
eaten_memory++;
return 0;
}
}
Note that it is perfectly sufficient to write to a single byte within each page (which contains 4096 bytes on X86). That's because all memory allocation from the kernel to a process is done at memory page granularity, which is, in turn, because of the hardware that does not allow paging at smaller granularities.
Upvotes: 224
Reputation: 234715
A sensible optimisation is being made here. The runtime does not actually acquire the memory until you use it.
A simple memcpy
will be sufficient to circumvent this optimisation. (You might find that calloc
still optimises out the memory allocation until the point of use.)
Upvotes: 13