Reputation: 30615
I have a C++ app on Linux which is extremely latency sensitive. My memory usage is around 2 GB, so with 4 KB pages and 64 TLB entries, I am going to encounter TLB misses.
I read in the Intel developer manuals that 2 MB (or 4 MB?) "huge" pages only reduce the number of TLB entries by half, so the larger range covered by each entry more than offsets the reduction in entries, and huge pages should be better for performance.
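To make that arithmetic concrete, here is a small sketch of "TLB reach" (entries times page size). The 64-entry figure and the halved 32-entry figure are the numbers assumed above, not measurements for any particular CPU, and the function name is made up for illustration.

```cpp
#include <cstdint>

// TLB reach: how much memory the TLB can map at once without a miss.
// Entry counts below are the assumptions from the question, not measured values.
constexpr std::uint64_t tlb_reach_bytes(std::uint64_t entries, std::uint64_t page_size) {
    return entries * page_size;
}

constexpr std::uint64_t kSmallPage = 4ull * 1024;         // 4 KB
constexpr std::uint64_t kHugePage  = 2ull * 1024 * 1024;  // 2 MB

// 64 entries x 4 KB = 256 KB of reach.
static_assert(tlb_reach_bytes(64, kSmallPage) == 256ull * 1024, "4 KB pages");
// 32 entries x 2 MB = 64 MB of reach, i.e. 256 times more despite half the entries.
static_assert(tlb_reach_bytes(32, kHugePage) == 64ull * 1024 * 1024, "2 MB pages");
```

Either way the reach is far below a 2 GB working set, so misses remain possible; huge pages only make them much rarer.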
How do I allocate memory using "huge" pages in a C++ application? Are there any trade-offs I should be aware of?
My Linux is a Red Hat distribution.
Upvotes: 29
Views: 23209
Reputation: 64955
You can also try transparent huge page (THP) support, which is available on any kernel from the last several years (at least anything in the 3.x and 4.x range, and also various 2.6.x kernels).
The primary benefit is that you don't need any special "hugetlbfs" setup; it "just works". The downside is that it is not guaranteed: the kernel may satisfy your allocations with huge pages if certain conditions are met and huge pages are available. Unlike hugetlbfs, which reserves a fixed number of huge pages at startup that are only available via specific calls, transparent huge pages are carved out of the general memory pool. This requires contiguous 2MB blocks of physical memory, which may become rare as system uptime grows, due to physical memory fragmentation.
Furthermore, there are various kernel tunables which affect whether you get a hugepage or not, the most important of which is /sys/kernel/mm/transparent_hugepage/enabled.
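As a rough sketch of how a program could inspect that tunable: the file holds one line listing the possible modes with the active one in square brackets, for example "always [madvise] never". The function names here are invented for illustration.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Extracts the bracketed (active) choice from a line such as
// "always [madvise] never". Returns an empty string if none is found.
std::string active_thp_mode(const std::string& line) {
    const auto open = line.find('[');
    if (open == std::string::npos) return {};
    const auto close = line.find(']', open + 1);
    if (close == std::string::npos) return {};
    return line.substr(open + 1, close - open - 1);
}

// Reads the tunable from sysfs. Returns an empty string if the file cannot
// be read, e.g. on a kernel built without transparent huge page support.
std::string read_thp_mode(
    const char* path = "/sys/kernel/mm/transparent_hugepage/enabled") {
    std::ifstream in(path);
    std::string line;
    if (!in || !std::getline(in, line)) return {};
    return active_thp_mode(line);
}
```

With "madvise" active, only regions you explicitly mark are eligible for huge pages; with "never", no amount of hinting will help.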
Your best bet is to allocate blocks on a 2MB boundary with posix_memalign and then call madvise(MADV_HUGEPAGE) on the allocated region before touching it for the first time. It also works with variants like aligned_alloc. In my experience, on systems that have /sys/kernel/mm/transparent_hugepage/enabled set to always, this generally results in a hugepage. However, I've mostly used it on systems with significant free memory and not-too-long uptime.
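A minimal sketch of that approach, assuming 2MB huge pages (the usual size on x86-64). The helper name is hypothetical, and rounding the size up to a whole number of huge pages is my own choice rather than something the kernel requires.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>
#include <sys/mman.h>

constexpr std::size_t kHugePageSize = 2u * 1024 * 1024;  // assumed: 2 MB huge pages

// Allocates at least `bytes`, aligned to a 2 MB boundary, and hints that the
// region should be backed by transparent huge pages. Returns nullptr if the
// allocation itself fails. Release the result with free().
void* alloc_thp_hinted(std::size_t bytes) {
    // Round up so the region covers whole huge pages.
    const std::size_t rounded =
        (bytes + kHugePageSize - 1) / kHugePageSize * kHugePageSize;
    void* p = nullptr;
    if (posix_memalign(&p, kHugePageSize, rounded) != 0) return nullptr;
#ifdef MADV_HUGEPAGE
    // Advisory only: if this fails (e.g. THP is unavailable) the memory is
    // still a perfectly usable regular allocation, so the result is ignored.
    (void)madvise(p, rounded, MADV_HUGEPAGE);
#endif
    return p;
}
```

The hint is issued before the memory is touched, so that the first page fault on the region has the chance to map a huge page directly.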
If you are using 2GB of memory, you could probably get a significant benefit from huge pages. If you allocate it all in small blocks, e.g. via malloc, there is a high chance transparent hugepages won't kick in, so consider allocating whatever uses the bulk of your memory (often a single object type) in a THP-aware way.
I also wrote a library to determine if you actually got hugepages from any given allocation. This probably isn't useful in a production application, but it can be a helpful diagnostic if you go the route of trying to use THP since at least you can determine if you got them or not.
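This is not the library mentioned above, but a much coarser check can be sketched with standard tools only: /proc/self/smaps reports an "AnonHugePages:" figure (in kB) for each mapping, and summing those gives the amount of the process's anonymous memory currently backed by transparent huge pages. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Sums every "AnonHugePages: N kB" line in smaps-formatted input.
// Returns the total in kB; lines that do not match are ignored.
std::uint64_t anon_huge_kb(std::istream& in) {
    std::uint64_t total = 0;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string key;
        std::uint64_t kb = 0;
        if (fields >> key >> kb && key == "AnonHugePages:") total += kb;
    }
    return total;
}

// Convenience wrapper for the current process. Returns 0 if the file
// cannot be opened.
std::uint64_t anon_huge_kb_self() {
    std::ifstream in("/proc/self/smaps");
    return anon_huge_kb(in);
}
```

Comparing the value before and after touching a freshly allocated region gives a quick indication of whether THP kicked in, although it cannot tell you which pages of a given allocation are huge.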
Upvotes: 19
Reputation: 20631
The "hugetlb" documentation from the kernel should help here.
Users can use the huge page support in Linux kernel by either using the mmap system call or standard SYSV shared memory system calls (shmget, shmat).
And:
Examples
1) map_hugetlb: see tools/testing/selftests/vm/map_hugetlb.c
2) hugepage-shm: see tools/testing/selftests/vm/hugepage-shm.c
3) hugepage-mmap: see tools/testing/selftests/vm/hugepage-mmap.c
4) The libhugetlbfs (https://github.com/libhugetlbfs/libhugetlbfs) library provides a wide range of userspace tools to help with huge page usability, environment setup, and control.
(These paths refer to the linux source tree).
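The mmap route from the documentation can be sketched roughly as follows. This is a simplified illustration, not a copy of the kernel's map_hugetlb.c; the helper names are hypothetical, and it assumes the default huge page size is 2MB and that huge pages have already been reserved.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

constexpr std::size_t kHugePageSize = 2u * 1024 * 1024;  // assumed default huge page size

// Maps `bytes` of anonymous memory backed by reserved huge pages.
// `bytes` should be a multiple of the huge page size.
// Returns nullptr if the mapping fails, e.g. when no huge pages are reserved.
void* map_hugetlb(std::size_t bytes) {
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Releases a region obtained from map_hugetlb(); a nullptr is ignored.
void unmap_hugetlb(void* p, std::size_t bytes) {
    if (p != nullptr) munmap(p, bytes);
}
```

Unlike the transparent variant, this either gives you huge pages or fails outright, so callers usually want a fallback to a normal allocation when nullptr comes back.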
So it basically boils down to calling mmap with the MAP_HUGETLB flag.
Upvotes: 11
Reputation: 1822
I am assuming you need huge pages only for a specific application written in C++; otherwise you could just change the page size of your system. The method below works for applications written in any language.
To use huge pages for a specific application, your kernel must be built with huge page support, i.e. with the CONFIG_HUGETLBFS option.
Specify the huge page size with hugepagesz=<size> on the boot command line.
To see how to set boot parameters: http://www.cyberciti.biz/tips/10-boot-time-parameters-you-should-know-about-the-linux-kernel.html
To set the number of huge pages, use
# echo 20 > /proc/sys/vm/nr_hugepages
To check the huge pages (available, total, …)
# cat /proc/meminfo
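The same check can be done from inside the application. Below is a minimal sketch that pulls the HugePages_Total, HugePages_Free and Hugepagesize fields out of /proc/meminfo-formatted text; the struct and function names are invented for illustration.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Huge page counters as reported by /proc/meminfo; -1 means "field not seen".
struct HugePageInfo {
    long total = -1;    // HugePages_Total (pages)
    long free = -1;     // HugePages_Free (pages)
    long size_kb = -1;  // Hugepagesize (kB)
};

// Parses meminfo-formatted text, e.g. from std::ifstream("/proc/meminfo").
HugePageInfo parse_hugepage_info(std::istream& in) {
    HugePageInfo info;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string key;
        long value = 0;
        if (!(fields >> key >> value)) continue;  // skip lines without a number
        if (key == "HugePages_Total:") info.total = value;
        else if (key == "HugePages_Free:") info.free = value;
        else if (key == "Hugepagesize:") info.size_kb = value;
    }
    return info;
}
```

If HugePages_Total stays below the number you wrote to nr_hugepages, the kernel could not find enough contiguous memory, which is why reserving early after boot tends to work best.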
Once all of the above is in place, the remaining question is how to use these pages from a particular application. Mount a file system of type hugetlbfs as follows:
# mount -t hugetlbfs -o uid=<value>,gid=<value>,mode=<value>,pagesize=<value>,size=<value>,min_size=<value>,nr_inodes=<value> none /mnt/huge
Files that your application creates under /mnt/huge and maps with mmap are then backed by huge pages of the size you configured.
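From the program's side, using the mount comes down to creating a file under it and mapping that file. Here is a minimal sketch, assuming the /mnt/huge mount point from the command above; the helper name and any file name you pass in are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Creates (or opens) a file on a hugetlbfs mount and maps `bytes` of it.
// `bytes` should be a multiple of the mount's huge page size.
// Returns nullptr if the file cannot be opened or the mapping fails.
// Release with munmap(), and unlink() the file when it is no longer needed.
void* map_hugetlbfs_file(const char* path, std::size_t bytes) {
    const int fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd < 0) return nullptr;
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    return p == MAP_FAILED ? nullptr : p;
}
```

A call might look like map_hugetlbfs_file("/mnt/huge/myapp_heap", 2u * 1024 * 1024). The path must be on the hugetlbfs mount; mapping an empty file on an ordinary file system this way would not give you huge pages and would fault on access.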
For more details check https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt
Merits / demerits of huge pages:
merits: efficiency due to fewer TLB misses, fewer page faults, and smaller page tables with fewer translation steps
demerits: more internal fragmentation (wasted memory) and more latency in swapping (hugetlbfs pages are not swapped out; their mapping is permanent)
For more details check https://lwn.net/Articles/359158/
EDIT: There is also an API available for allocating huge pages; these links may help:
https://github.com/libhugetlbfs/libhugetlbfs/blob/master/HOWTO
https://lwn.net/Articles/375096/
Upvotes: 7