Jérôme Pouiller

Reputation: 10197

What is the performance of C/C++ allocators in a multithreaded context?

When memory is allocated using new or malloc, the allocator may have to protect itself against re-entrance. I see two ways to do this:

  1. Protect the allocator's shared state with a mutex
  2. Use a separate pool of memory for each thread

I think most allocators use the second method, but I cannot find any confirmation of this.

Do you know which allocators use which method? Is there any standard covering this?

Upvotes: 3

Views: 2194

Answers (3)

Jérôme Pouiller

Reputation: 10197

Google perf tools provide an allocator named TCMalloc. This allocator uses a pool of memory for each thread (a "thread-caching" system). Its documentation shows measured performance improvements over glibc 2.3.

Glibc has used a pool of memory for each thread since 2.16.

As a result, the performance difference has largely disappeared:

Fedora [we] used to use tcmalloc for QEMU for a while. Then we checked performance again and found that the delta to glibc's native malloc had essentially gone

Also note that the C++ new operator typically calls the malloc function provided by libc (glibc malloc in most cases).

So:

  1. No, this behavior is not standardized.
  2. The allocator uses a pool per thread if (and only if) you use glibc >= 2.16; otherwise, you can try linking against TCMalloc.
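
To check whether the allocator is a bottleneck on a given system, a rough micro-benchmark can help. The sketch below is only an illustration: the function name `run_alloc_benchmark` and its parameters are hypothetical, and the timings depend heavily on the machine, the libc version, and the allocation sizes.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <thread>
#include <vector>

// Hypothetical helper: every thread performs `iterations` malloc/free pairs
// of `size` bytes. Returns the elapsed wall-clock time in seconds.
double run_alloc_benchmark(int threads, int iterations, std::size_t size) {
    assert(threads > 0 && iterations > 0 && size > 0);
    auto start = std::chrono::steady_clock::now();

    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([iterations, size] {
            for (int i = 0; i < iterations; ++i) {
                void* p = std::malloc(size);
                // Touch the block through a volatile pointer so the compiler
                // cannot remove the malloc/free pair.
                if (p) static_cast<volatile char*>(p)[0] = 1;
                std::free(p);
            }
        });
    }
    for (auto& w : workers) w.join();

    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}
```

Running it with 1 thread and then with several threads (same `iterations`) gives a hint: if the time grows roughly with the thread count, the allocator is probably serializing on a lock. The same binary can then be built against TCMalloc (for example by linking with `-ltcmalloc`, assuming gperftools is installed) to compare.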

Upvotes: 4

alexbuisson

Reputation: 8469

Every multithreaded program analysis I have done with Intel Parallel Studio (under Windows) has shown lock events and kernel time due to allocation. This suggests that the C++ new of the VS'08 compiler mainly relies on a mutex to keep memory consistent.

Each time this becomes an issue in the software I develop, I try to use the RAII idiom and remove dynamic/shared memory, or use a TLS allocator if the memory is only used by the thread itself.
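
As an illustration of the TLS allocator idea, here is a minimal sketch of a per-thread bump arena. The names `ThreadArena` and `local_arena` and the 64 KiB buffer size are assumptions made for this example; a real allocator would also need to handle freeing and growth.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>

// Hypothetical per-thread bump arena: each thread owns its buffer,
// so allocate() needs no lock.
class ThreadArena {
public:
    // `align` must be a power of two. Returns nullptr when the arena is full.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        std::size_t start = (offset_ + align - 1) & ~(align - 1);
        if (start > sizeof(buffer_) || size > sizeof(buffer_) - start)
            return nullptr;
        offset_ = start + size;
        return buffer_ + start;
    }

    // Releases everything at once; individual blocks cannot be freed.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    alignas(std::max_align_t) unsigned char buffer_[64 * 1024];
    std::size_t offset_ = 0;
};

// One arena instance per thread, created on first use in that thread.
inline ThreadArena& local_arena() {
    thread_local ThreadArena arena;
    return arena;
}

// Returns true if a second thread sees a different arena than the caller.
inline bool arenas_are_per_thread() {
    ThreadArena* mine = &local_arena();
    bool distinct = false;
    std::thread t([mine, &distinct] { distinct = (&local_arena() != mine); });
    t.join();
    return distinct;
}
```

Memory obtained from `local_arena()` must not outlive the thread that allocated it, which is why this only fits data used by the thread itself.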

Upvotes: 0

Jérôme Pouiller

Reputation: 10197

C++17 begins to specify the behavior of allocators in threaded applications:

  • std::pmr::unsynchronized_pool_resource is not thread-safe, and cannot be accessed from multiple threads simultaneously
  • std::pmr::synchronized_pool_resource may be accessed from multiple threads without external synchronization, and may have thread-specific pools to reduce synchronization costs. If the memory resource is only accessed from one thread, unsynchronized_pool_resource is more efficient.
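
A minimal sketch of how the two resources might be used, assuming a C++17 standard library that ships `<memory_resource>`. The helper names `fill_vector`, `single_thread_demo` and `multi_thread_demo` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <thread>
#include <vector>

// Fills a pmr vector on the calling thread; every allocation goes through `res`.
std::size_t fill_vector(std::pmr::memory_resource& res, int count) {
    std::pmr::vector<int> v(&res);
    for (int i = 0; i < count; ++i) v.push_back(i);
    return v.size();
}

// Single-threaded use: the unsynchronized pool avoids locking entirely.
std::size_t single_thread_demo(int count) {
    std::pmr::unsynchronized_pool_resource pool;
    return fill_vector(pool, count);
}

// Multi-threaded use: one synchronized pool is shared by all threads.
std::size_t multi_thread_demo(int threads, int count) {
    assert(threads > 0);
    std::pmr::synchronized_pool_resource pool;
    std::vector<std::size_t> sizes(threads, 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&pool, &sizes, t, count] {
            sizes[t] = fill_vector(pool, count);  // each thread writes its own slot
        });
    }
    for (auto& w : workers) w.join();

    std::size_t total = 0;
    for (std::size_t s : sizes) total += s;
    return total;
}
```

Passing the `unsynchronized_pool_resource` to several threads at once would be a data race, so the choice between the two is a correctness decision first and a performance decision second.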

Upvotes: 0
