lppier
lppier

Reputation: 1987

Stack memory not released

I have the following loop, which pops a C++ concurrent queue I have, from the implementation here. https://juanchopanzacpp.wordpress.com/2013/02/26/concurrent-queue-c11/

while (!interrupted)
{
    pxData data = queue->pop(); 
    if (data.value == -1)
    { 
        break; // exit loop on terminating condition
     }
    usleep(7000); // stub to simulate processing
}

I am looking at the memory history using System Monitor in CentOS7. I'm trying to free up the memory taken up by the queue, after reading the value from the queue. However, as the following while loop runs, I don't see the memory usage going down. I've verified that the queue length does go down.

It does go down, however, when -1 is encountered and the loop exits. (program is still running) But I can't have this, because where usleep is, I want to do some intensive processing.

Question: Why doesn't the memory occupied by data get free-ed? (according to System Monitor) Isn't the stack allocated memory supposed to be free-ed when the variable goes out of scope?

The struct is defined as follows, and populated at the beginning of the program.

typedef struct pxData
{
  float value; // -1 value terminates the loop
  float x, y, z;
  std::complex<float> valueData[65536];
} pxData;

It's populated with ~10000 pxData, which roughly translates to 5GB. System only has ~8GB. So it's important that the memory is free-ed up for doing other processing in the system.

Upvotes: 4

Views: 1609

Answers (2)

A C++ implementation would call some operator delete to release dynamically allocated (using some operator new) memory. In several C++ standard libraries, new calls malloc and delete calls free.

(I am focusing with a Linux point of view, but the principles are similar on other OSes)

But while malloc (or ::operator new) is sometimes asking the OS kernel some more memory by system calls changing the virtual address space like mmap(2), free (or ::operator delete) is often simply marking the released memory zone as re-available to future calls to malloc (or to new)

So from the kernel point of view (e.g. as seen thru /proc/, see proc(5)...), the virtual address space is not changing, and the memory remains consumed, even if inside the application it is marked as "freed" and will be reused at some future allocation (by future calls to malloc or new)

And most C++ standard containers are internally using heap data. In particular your local (stack-allocated) std::map or std::vector (or std::deque) variable will call new & delete for internal data.


BTW, I find quite strange your declaration. Unless every struct pxData has exactly 65536 used valueData slots, I would suggest to use some std::vector so have

  std::vector<std::complex<float>> valueData;

and improve your code accordingly. You'll probably need to do some valueData.reserve(somesize); and/or valueData.resize(somesize); and/or valueData.push_back(somecomplexnumber); etc....

Upvotes: 4

John Zwinck
John Zwinck

Reputation: 249652

There are a few things at play here.

Virtual Memory

First, you need to understand that just because your program is "using" 5 GB of memory does not mean that there are only 3 GB of RAM left for other programs. Virtual memory means that those 5 GB might be only 1 GB of actual "resident" data, and the other 4 GB may actually be on disk rather than in RAM. So it's important to look at the "resident set size" rather than the "virtual size" when you're looking at your program. And note that if your system actually runs low on RAM, the OS may shrink the RSS of some programs by "paging out" some of their memory. So don't worry too much about "5 GB" appearing in the system monitor--worry if you have a real, concrete performance problem.

Heap Allocation

The second aspect is why your virtual size does not decrease as you remove items from the queue. We can guess that you put those elements into the queue by creating them with malloc or new one-by-one, then pushing them onto the back of the queue. This means that the first element you allocated will come out of the queue first. And that in turn means that when you have drained 90% of the queue, your memory allocation might look like this:

[program|------------------unused-------------------|pxData]

The problem here is that in the real world, just because you free or delete something does not mean the operating system instantly reclaims that memory. In fact, it may not be able to reclaim any unused spans unless they are at the "end" (i.e. most recently allocated). Since C++ does not have garbage collection and cannot move items around in memory without your consent, you end up with this big "hole" in your program's virtual memory. That hole would be used to satisfy future memory allocation requests, but if you haven't got any, it just sits there, until the queue is completely empty:

[program|------------------unused--------------------------]

Then the system is able to shrink your virtual address space back down:

[program]

Which brings you back to where you started.

Solutions

If you want to "fix" this, one option is to allocate your memory in "reverse", i.e. put the last items allocated into the front of the queue.

Another option is to allocate the elements for the queue via mmap, which is something that e.g. Linux will do automatically for allocations which are "large." You can change the threshold for this by calling mallopt(3) with M_MMAP_THRESHOLD and setting it to be a little bit smaller than your struct size. This makes the allocations independent of each other, so the OS can reclaim them individually. This technique can even be applied to existing programs without recompilation, so is often useful if you need to solve this problem in a program you cannot modify.

Upvotes: 6

Related Questions