Reputation: 17553
I have recently encountered a very strange issue that might be due to the kernel memory allocator. At first, I suspected some type of memory bug in my C++ code but the exact behavior I am seeing leads me to believe that perhaps it is not due to a bug in the code. It's quite strange, but here's my best description of the problem.
I have an application that writes and overwrites files in the /dev/shm area of my machine. At the beginning of the program, it declares file pointers for all of the files it is going to write and continuously overwrite. These pointers are all created at the start of the program.
When I run the code, I notice the following. First, memory usage jumps to 4.3% of my system total (as reported by top). This happens right when I launch the executable. Then the CPU usage hovers around 40-50% before the code even starts doing anything. After about 2-3 minutes of this, the memory usage goes to 5.0% and there are no further increases. At that point, the CPU usage falls to 5-15%, which is the range the program usually runs at (due to the rate at which data is being passed to it).
Something is happening with memory behind the scenes during my program's startup, but I can't understand what it is. It feels like it shouldn't take 2-3 minutes to allocate 5% of system memory (1.2GB) on a modern x86_64 server. Note that after this strange startup, the program usually runs without issue.
However, today I had to increase the number of files the program writes to in /dev/shm and, accordingly, the number of pointers as well. This is where the trouble starts: during the startup procedure, the CPU usage suddenly jumps to 100% and stays there. This is a huge problem because it slows my application down massively, below acceptable levels. The only difference between this and the working executable is the number of files I am having it write. To give specifics, I increased the number of files from 1345 to 1350. In fact, just one over 1346 is sufficient to kick off this 100% CPU issue.
I'm really at a loss about what I am dealing with here. I suspect it may have something to do with the SLAB/SLUB allocator (my system is CentOS 5.8 with a 2.6.35 kernel). Any ideas or hints about how to resolve this would be much appreciated.
Upvotes: 0
Views: 822
Reputation: 2720
I think it's unlikely to be a problem with SLUB. /dev/shm is implemented via tmpfs (on modern systems), which uses the page cache, not SLUB.
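If you want to confirm from inside the program that /dev/shm really is tmpfs on your machine, something like the following sketch might help. It assumes Linux with the statfs(2) call and the TMPFS_MAGIC constant from <linux/magic.h>; the helper name is_tmpfs is hypothetical, made up for this illustration.

```cpp
#include <sys/vfs.h>      // statfs(2)
#include <linux/magic.h>  // TMPFS_MAGIC
#include <cassert>

// Hypothetical helper: returns 1 if `path` lives on tmpfs, 0 if it lives on
// some other filesystem, and -1 if statfs() fails (errno is left set).
int is_tmpfs(const char* path) {
    assert(path != nullptr);
    struct statfs fs;
    if (statfs(path, &fs) != 0) {
        return -1;
    }
    // f_type holds the filesystem magic number; compare as unsigned values.
    return static_cast<unsigned long>(fs.f_type) ==
                   static_cast<unsigned long>(TMPFS_MAGIC)
               ? 1
               : 0;
}
```

Calling is_tmpfs("/dev/shm") at startup and logging the result would tell you whether the tmpfs assumption holds on your system.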
You need to work out what your program is doing when it's chewing CPU. You could start with strace, which will at least show you whether your program is spending lots of time in the kernel or in your code. From there you should learn to use perf.
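As a rough complement to strace and perf, you can also get the kernel-versus-user split from inside the program. The sketch below uses getrusage(2), which reports user and system CPU time consumed by the process so far. The names CpuSplit, cpu_split and report_cpu_split are hypothetical, and where you take the two snapshots (for example, before and after the startup phase that opens the files) is an assumption about your program's structure.

```cpp
#include <sys/resource.h>  // getrusage(2)
#include <sys/time.h>      // timeval
#include <cassert>
#include <cstdio>

// CPU time consumed by this process so far, in seconds.
struct CpuSplit {
    double user_s;  // time spent executing in user space (your code, libraries)
    double sys_s;   // time spent executing in the kernel on the process's behalf
};

static double to_seconds(const timeval& tv) {
    return static_cast<double>(tv.tv_sec) + static_cast<double>(tv.tv_usec) / 1e6;
}

// Take a snapshot of the process's accumulated user and system CPU time.
CpuSplit cpu_split() {
    rusage ru = rusage();
    int rc = getrusage(RUSAGE_SELF, &ru);
    assert(rc == 0);  // only fails on invalid arguments
    (void)rc;
    CpuSplit s = {to_seconds(ru.ru_utime), to_seconds(ru.ru_stime)};
    return s;
}

// Print how much user and system CPU time was consumed between two snapshots.
void report_cpu_split(const char* label, const CpuSplit& before,
                      const CpuSplit& after) {
    std::printf("%s: user %.3f s, system %.3f s\n", label,
                after.user_s - before.user_s, after.sys_s - before.sys_s);
}
```

Take one snapshot with cpu_split() before the startup phase and one after it, then pass both to report_cpu_split. If the system figure dominates, the time is going into the kernel and the syscalls strace shows are the place to look; if the user figure dominates, a profiler such as perf pointed at your own code is probably the better next step.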
Upvotes: 2