Reputation: 133
I'm writing a small program for Wayland that uses software rendering and wl_shm for display. This requires that I pass a file descriptor for my screen buffer to the Wayland server, which then calls mmap() on it, i.e. the screen buffer must be shareable between processes.
In this program, startup latency is key. Currently, there is only one remaining bottleneck: the initial draw to the screen buffer, where the entire buffer is painted over. The code below shows a simplified version of this:
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/mman.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
    /* Fullscreen buffers are around 10-30 MiB for common resolutions. */
    const size_t size = 2880 * 1800 * 4;
    int fd = memfd_create("shm", 0);
    ftruncate(fd, size);
    void *pool = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    /* Ideally, we could just malloc, but this memory needs to be shared. */
    //void *pool = malloc(size);
    /* In reality this is a cairo_paint() call. */
    memset(pool, 0xCF, size);
    /* Subsequent paints (or memsets) after the first take negligible time. */
}
On my laptop, the memset() above takes around 21-28 ms. Switching to malloc()'ed memory drops this to 12 ms, but the problem is that the memory needs to be shared between processes. The behaviour is similar on my desktop: 7 ms for mmap(), 3 ms for malloc().
My question is: Is there something I'm missing that can improve the performance of shared memory on Linux? I've tried madvise() with MADV_WILLNEED and MADV_SEQUENTIAL, and using mlock(), but none of those made a difference. I've also thought about whether 2 MiB huge pages would help given the buffer sizes of around 10-30 MiB, but they're not usually available.
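For context, those hints were applied to the mapping from the example above, roughly like this (a sketch of the approach, not my exact code):

/* Hints applied right after the mmap(); none of them changed the first-paint time. */
madvise(pool, size, MADV_WILLNEED);   /* hint that the whole range will be needed soon */
madvise(pool, size, MADV_SEQUENTIAL); /* hint at a sequential access pattern */
mlock(pool, size);                    /* lock the pages into RAM */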
Edit: I've tried mmap() with MAP_ANONYMOUS | MAP_SHARED, which is just as slow as before. MAP_ANONYMOUS | MAP_PRIVATE results in the same speed as malloc(), but that defeats the purpose.
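Concretely, the two variants I compared look roughly like this (a sketch, reusing size from the example above):

/* Anonymous shared mapping: the first memset() is just as slow as with the memfd. */
void *anon_shared = mmap(NULL, size, PROT_READ | PROT_WRITE,
                         MAP_SHARED | MAP_ANONYMOUS, -1, 0);
/* Anonymous private mapping: as fast as malloc(), but there's no fd to pass to
 * the compositor, so it can't back a wl_shm pool. */
void *anon_private = mmap(NULL, size, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);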
Upvotes: 6
Views: 1107
Reputation: 133
The difference in performance between malloc() and mmap() seems to be due to the differing application of Transparent Hugepages.
By default on x86_64, the page size is 4 KiB and the huge page size is 2 MiB. Transparent Hugepages allows programs that don't know about hugepages to still use them, reducing page-fault overhead. However, it is only enabled by default for private, anonymous memory, which covers both malloc() and mmap() with MAP_ANONYMOUS | MAP_PRIVATE set, explaining why those two perform identically. For shared memory mappings it is disabled, resulting in more page-handling overhead (for the 10-30 MiB buffers I need) and causing the slowdown.
Hugepages can be enabled for shared memory mappings, as explained in the kernel docs page, via the /sys/kernel/mm/transparent_hugepage/shmem_enabled knob. This defaults to never, but setting it to always (or to advise, and adding the corresponding madvise(..., MADV_HUGEPAGE) call) allows memory mapped with MAP_SHARED to use hugepages, and the performance then matches malloc()'ed memory.
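In the example from my question, that amounts to one extra call between the mmap() and the first memset() (a minimal sketch; it only has an effect when shmem_enabled is advise or always, and needs a kernel built with CONFIG_TRANSPARENT_HUGEPAGE):

void *pool = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
/* Request hugepages for the shared mapping before the first write touches it. */
if (madvise(pool, size, MADV_HUGEPAGE) == -1)
    perror("madvise");
memset(pool, 0xCF, size);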
I'm unsure why the default is never for shared memory. While not very satisfactory, for now it seems the only solution is to use madvise(MADV_HUGEPAGE) to improve performance on any systems which happen to have shmem_enabled set to at least advise (or if it's enabled by default in future).
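If it's useful, here's a rough way to check that knob at runtime before deciding whether the madvise() call can help (a sketch; the active policy is the word shown in square brackets in that file):

#include <stdio.h>
#include <string.h>

/* Returns 1 if shmem THP can be requested via MADV_HUGEPAGE (policy is always or advise). */
static int shmem_thp_usable(void)
{
    char buf[128];
    FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/shmem_enabled", "r");
    if (!f || !fgets(buf, sizeof(buf), f)) {
        if (f)
            fclose(f);
        return 0;
    }
    fclose(f);
    /* File looks like: "always within_size advise [never] deny force" */
    return strstr(buf, "[always]") != NULL || strstr(buf, "[advise]") != NULL;
}

int main(void)
{
    printf("shmem hugepages usable: %s\n", shmem_thp_usable() ? "yes" : "no");
    return 0;
}

The check is optional; it just makes it visible whether the shmem_enabled knob is what's limiting things on a given system.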
Upvotes: 5