Reputation: 1371
I have a program with three pools of structs. For each of them I use a list of used structs and another one for the unused structs. During execution the program consumes structs and returns them to the pool on demand. There is also a garbage collector that cleans up the "zombie" structs and returns them to the pool.
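For reference, here is a minimal sketch of the layout I am describing (the names are illustrative, not my actual code):

```c
#include <stdlib.h>

/* Illustrative only. Each struct carries an intrusive link so it can sit
 * on the pool's free list; the real program also keeps a "used" list so
 * the garbage collector can find zombie structs. */
struct node {
    struct node *next;
    /* ... payload ... */
};

struct pool {
    struct node *unused; /* free list of structs available for reuse */
};

static struct node *pool_get(struct pool *p)
{
    struct node *n = p->unused;
    if (n)
        p->unused = n->next;    /* reuse a pooled struct */
    else
        n = malloc(sizeof(*n)); /* grow the pool */
    return n;
}

/* Returning a struct only relinks it; nothing is given back to the OS,
 * which is why RSS does not drop when the pool usage drops. */
static void pool_put(struct pool *p, struct node *n)
{
    n->next = p->unused;
    p->unused = n;
}
```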
At the beginning of the execution, the virtual memory, as expected, shows around 10 GB allocated, and as the program uses the pool, the RSS memory increases.
Although the used nodes go back to the pool, marked as unused nodes, the RSS memory does not decrease. I expected this: the OS doesn't know what I'm doing with the memory, so it can't tell whether I'm really using it or just managing a pool.
What I would like to do is force the unused memory back to virtual memory whenever I want, for example when the RSS memory grows above X GB.
Is there any way, given the memory pointer, to mark a memory area so that it goes back to virtual memory? I know this is the operating system's responsibility, but maybe there is a way to force it.
Maybe I shouldn't care about this at all; what do you think?
Thanks in advance.
I provide a picture of the pool usage vs. the memory usage, for a few files. As you can see, the sudden drops in the pool usage are due to the garbage collector; what I would like to see is this drop reflected in the memory usage.
Upvotes: 5
Views: 530
Reputation: 1323743
Git 2.19 (Q3 2018) offers an example of a memory pool of structs, using mmap, not malloc.
For a large tree, the index needs to hold many cache entries allocated on heap. These cache entries are now allocated out of a dedicated memory pool to amortize malloc(3) overhead.
See commit 8616a2d, commit 8e72d67, commit 0e58301, commit 158dfef, commit 8fb8e3f, commit a849735, commit 825ed4d, commit 768d796 (02 Jul 2018) by Jameson Miller (jamill).
(Merged by Junio C Hamano -- gitster -- in commit ae533c4, 02 Aug 2018)
block alloc: allocate cache entries from mem_pool
When reading large indexes from disk, a portion of the time is dominated by malloc() calls. This can be mitigated by allocating a large block of memory and managing it ourselves via memory pools.

This change moves the cache entry allocation to be on top of memory pools.
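To illustrate the general idea (a generic sketch, not Git's actual mem_pool code), such a pool hands out entries from large pre-allocated blocks and frees them all at once:

```c
#include <stdlib.h>

/* Generic sketch of a block-based pool: carve small allocations out of
 * large blocks, then release every block in one shot when the pool is
 * discarded. Names are illustrative, not Git's. */
struct pool_block {
    struct pool_block *next;
    size_t used;
    size_t size;
    unsigned char data[];     /* flexible array member (C99) */
};

struct mem_pool {
    struct pool_block *blocks;
    size_t block_size;        /* granularity of new blocks, e.g. 1 MB */
};

static void *pool_alloc(struct mem_pool *pool, size_t len)
{
    len = (len + 15) & ~(size_t)15;   /* keep allocations aligned */
    struct pool_block *b = pool->blocks;
    if (!b || b->size - b->used < len) {
        size_t size = len > pool->block_size ? len : pool->block_size;
        b = malloc(sizeof(*b) + size);
        if (!b)
            return NULL;
        b->next = pool->blocks;
        b->used = 0;
        b->size = size;
        pool->blocks = b;
    }
    void *p = b->data + b->used;
    b->used += len;
    return p;
}

/* Discarding the pool frees all entries at once; this is how an entry's
 * lifetime gets tied to the lifetime of the structure that owns the pool. */
static void pool_discard(struct mem_pool *pool)
{
    while (pool->blocks) {
        struct pool_block *next = pool->blocks->next;
        free(pool->blocks);
        pool->blocks = next;
    }
}
```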
Design:

The index_state struct will gain a notion of an associated memory_pool from which cache_entries will be allocated.

When reading in the index from disk, we have information on the number of entries and their size, which can guide us in deciding how large our initial memory allocation should be.

When an index is discarded, the associated memory_pool will be discarded as well, so the lifetime of a cache_entry is tied to the lifetime of the index_state that it was allocated for.

In the case of a Split Index, the following rules are followed.
First, some terminology is defined:

- 'the_index': represents the logical view of the index.
- 'split_index': represents the "base" cache entries. Read from the split index file.

'the_index' can reference a single split_index, as well as cache_entries from the split_index. the_index will be discarded before the split_index is.

This means that when we are allocating cache_entries in the presence of a split index, we need to allocate the entries from the split_index's memory pool. This allows us to follow the pattern that the_index can reference cache_entries from the split_index, and that the cache_entries will not be freed while they are still being referenced.
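In other words (an illustrative sketch with stand-in types, not Git's actual ones), the allocation site just picks the longest-lived pool:

```c
struct mem_pool;                 /* opaque pool handle */

struct split_index_like {        /* stand-in for Git's split_index */
    struct mem_pool *pool;
};

struct index_state_like {        /* stand-in for Git's index_state */
    struct mem_pool *pool;
    struct split_index_like *split_index;
};

/* Allocate from the split_index's pool when one is present, because
 * the_index is discarded before the split_index is. */
static struct mem_pool *pool_to_use(struct index_state_like *istate)
{
    return istate->split_index ? istate->split_index->pool : istate->pool;
}
```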
Managing transient cache_entry structs:

Cache entries are usually allocated for an index, but this is not always the case. Cache entries are sometimes allocated because this is the type that the existing checkout_entry function works with.
Because of this, the existing code needs to handle cache entries associated with an index / memory pool, and those that only exist transiently.
Several strategies were contemplated around how to handle this.

Chosen approach:
An extra field was added to the cache_entry type to track whether the cache_entry was allocated from a memory pool or not. This is currently an int field, as there are no more available bits in the existing ce_flags bit field. If / when more bits are needed, this new field can be turned into a proper bit field.

We decided tracking and iterating over known memory pool regions was less desirable than adding an extra field to track this state.
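A minimal sketch of that approach (field and function names are illustrative, not Git's): the entry records where it came from, and the free path only releases transient entries:

```c
#include <stdlib.h>

/* Illustrative sketch: tag each entry with its origin so that freeing a
 * transient entry works, while pool-backed entries are left alone (they
 * are released when their pool is discarded). */
struct cache_entry_like {
    unsigned int flags;
    int mem_pool_allocated;  /* 1 if carved out of a memory pool */
    /* ... other fields ... */
};

static void discard_entry(struct cache_entry_like *ce)
{
    if (ce->mem_pool_allocated)
        return;      /* owned by the pool; freed with the pool  */
    free(ce);        /* transient entry: free it individually   */
}
```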
Upvotes: 1
Reputation: 6607
You can do this as long as you are allocating your memory via mmap and not via malloc. You want to use the posix_madvise function with the POSIX_MADV_DONTNEED advice (the POSIX_MADV_* constants belong to posix_madvise; the native Linux madvise uses MADV_DONTNEED).

Just remember to run posix_madvise with POSIX_MADV_WILLNEED before using the pages again, to ensure there is actually memory behind them.

This does not actually guarantee the pages will be swapped out, but it gives the kernel a strong hint to do so when it has time.
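A minimal sketch of the pattern, assuming the pool is backed by an anonymous mmap region (the 64 MB size and the page-touching memset are just for illustration):

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    size_t len = 64 * 1024 * 1024; /* 64 MB pool */
    void *pool = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (pool == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    memset(pool, 0xab, len);  /* touch the pages: RSS grows */

    /* Hint that these pages are not needed for now. Note: posix_madvise
     * is only advisory (glibc even treats POSIX_MADV_DONTNEED as a no-op);
     * the native madvise(pool, len, MADV_DONTNEED) on Linux drops anonymous
     * pages immediately, but their contents are lost (zero-filled on next
     * access), so only do this for structs you no longer need. */
    posix_madvise(pool, len, POSIX_MADV_DONTNEED);

    /* Before reusing the region, hint that it will be needed again. */
    posix_madvise(pool, len, POSIX_MADV_WILLNEED);

    munmap(pool, len);
    return 0;
}
```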
Upvotes: 3