Reputation: 610
In a cross-platform (Linux and Windows) real-time application, I need the fastest way to share data between a C++ process and a Python application, both of which I control. I currently use sockets, but they are too slow for high-bandwidth data (4K images at 30 fps).
Ultimately I would like to use the multiprocessing shared memory, but my first attempts suggest it does not work. I create the shared memory in C++ using Boost.Interprocess and try to read it in Python like this:

    #include <boost/interprocess/shared_memory_object.hpp>
    #include <boost/interprocess/mapped_region.hpp>
    #include <cstdlib>  // std::system
    #include <cstring>  // std::memset

    int main(int argc, char* argv[])
    {
        using namespace boost::interprocess;

        // Remove shared memory on construction and destruction
        struct shm_remove
        {
            shm_remove() { shared_memory_object::remove("myshm"); }
            ~shm_remove() { shared_memory_object::remove("myshm"); }
        } remover;

        // Create a shared memory object.
        shared_memory_object shm(create_only, "myshm", read_write);

        // Set size
        shm.truncate(1000);

        // Map the whole shared memory in this process
        mapped_region region(shm, read_write);

        // Write all the memory to 1
        std::memset(region.get_address(), 1, region.get_size());

        // Keep the process (and the shared memory) alive; "pause" is Windows-only
        std::system("pause");
        return 0;
    }

And my Python code:

    from multiprocessing import shared_memory

    if __name__ == "__main__":
        shm_a = shared_memory.SharedMemory(name="myshm", create=False)
        buffer = shm_a.buf
        print(buffer[0])

I get a system error: FileNotFoundError: [WinError 2] : File not found. So I guess it only works internally in Python multiprocessing, right? Python does not seem to find the shared memory created on the C++ side.
Another possibility would be to use mmap, but I'm afraid it is not as fast as "pure" shared memory (which does not go through the filesystem). As stated in the Boost.Interprocess documentation:
However, as the operating system has to synchronize the file contents with the memory contents, memory-mapped files are not as fast as shared memory
However, I don't know how much slower it is. I would simply prefer the fastest solution, as this is currently the bottleneck of my application.
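To make the mmap option concrete, here is a minimal Python sketch of sharing a buffer through a memory-mapped file. The file name `frame.bin` and the helpers `write_side` / `read_side` are hypothetical; a C++ peer would map the same path (for example with Boost.Interprocess `file_mapping`, which is an assumption about how the other side would be written). As far as I understand, both mappings are backed by the same OS page-cache pages, so the extra cost the documentation mentions is mostly the eventual write-back to disk rather than a copy on every access.

```python
import mmap
import os
import tempfile

SIZE = 1000  # same size as the Boost example in the question


def write_side(path):
    """Create the backing file, map it, and fill it with 1s."""
    with open(path, "wb") as f:
        f.truncate(SIZE)  # the file must have its final size before mapping
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), SIZE) as view:
        view[:] = b"\x01" * SIZE


def read_side(path):
    """Map the same file read-only and return (first byte, mapped length)."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), SIZE, access=mmap.ACCESS_READ) as view:
        return view[0], len(view)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "frame.bin")  # hypothetical file name
        write_side(path)
        print(read_side(path))
```

In a real setup the two functions would run in different processes and keep their mappings open for the lifetime of the stream instead of remapping for every frame.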
Upvotes: 4
Views: 5312
Reputation: 49
For future viewers: I fixed this error by using Boost.Interprocess's windows_shared_memory instead of shared_memory_object.
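Some background on why this helps: as far as I can tell, on Windows Python's `multiprocessing.shared_memory` opens a named file mapping, which is the kind of object `windows_shared_memory` creates, whereas `shared_memory_object` emulates POSIX shared memory with a file on Windows. The `FileNotFoundError` in the question is simply what Python raises when no segment exists under the requested name. A Python-only sketch of that behaviour, using a hypothetical segment name and a hypothetical helper `attach_or_none`:

```python
import os
from multiprocessing import shared_memory


def attach_or_none(name):
    """Attach to an existing segment by name; return None if it does not exist."""
    try:
        return shared_memory.SharedMemory(name=name, create=False)
    except FileNotFoundError:
        return None


def demo():
    name = f"demo_shm_{os.getpid()}"  # hypothetical name, unique per run

    # Nothing has created the segment yet, so attaching by name fails.
    missing = attach_or_none(name)

    owner = shared_memory.SharedMemory(name=name, create=True, size=1000)
    try:
        owner.buf[:1000] = b"\x01" * 1000  # same fill as the C++ memset
        reader = attach_or_none(name)      # now the name resolves
        first_byte = reader.buf[0]
        reader.close()
    finally:
        owner.close()
        owner.unlink()
    return missing, first_byte


if __name__ == "__main__":
    print(demo())
```

Here the owner and the reader live in the same process only to keep the sketch short; the point is that the attach succeeds once something has created a segment under the name Python actually looks up.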
Upvotes: 1
Reputation: 956
An example of communication between C++ and Python using shared memory and memory mapping can be found at https://stackoverflow.com/a/69806149/2625176.
Upvotes: 2
Reputation: 610
So I spent the last few days implementing shared memory using mmap, and the results are quite good in my opinion. Here are the benchmark results comparing my two implementations: pure TCP, and a mix of TCP and shared memory.
The benchmark consists of moving data from C++ to Python (as a numpy.ndarray), then sending the data back to the C++ process. No further processing is involved, only serialization, deserialization and inter-process communication (IPC).
Case A:
Communication is done with TCP {header + data}.
Case B:
Communication is hybrid: synchronization is done through sockets (only the header is passed) and data is moved through shared memory. I think this design is great because I have suffered in the past from synchronization problems when using condition variables in shared memory, and TCP is easy to use in both C++ and Python environments.
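The hybrid scheme of Case B can be sketched roughly as follows. This is not my actual implementation, only an illustration under stated assumptions: the header layout (`<IQ`: sequence number and payload length), the segment name, and the helper names are all hypothetical, `socket.socketpair()` stands in for the TCP connection, and `multiprocessing.shared_memory` stands in for the mmap-backed buffer.

```python
import os
import socket
import struct
from multiprocessing import shared_memory

# Hypothetical header: sequence number (uint32) + payload length (uint64).
HEADER = struct.Struct("<IQ")


def send_frame(sock, shm, seq, payload):
    """Write the payload into shared memory, then announce it on the socket."""
    shm.buf[:len(payload)] = payload
    sock.sendall(HEADER.pack(seq, len(payload)))


def recv_exact(sock, n):
    """Read exactly n bytes from the socket."""
    data = bytearray()
    while len(data) < n:
        part = sock.recv(n - len(data))
        if not part:
            raise ConnectionError("peer closed before the header was complete")
        data += part
    return bytes(data)


def recv_frame(sock, shm):
    """Block on the header, then read the payload from shared memory."""
    seq, length = HEADER.unpack(recv_exact(sock, HEADER.size))
    return seq, bytes(shm.buf[:length])  # copied here only for simplicity


def demo():
    name = f"demo_frames_{os.getpid()}"  # hypothetical segment name
    producer_shm = shared_memory.SharedMemory(name=name, create=True, size=1 << 20)
    consumer_shm = shared_memory.SharedMemory(name=name)
    tx, rx = socket.socketpair()  # stand-in for the TCP connection
    try:
        payload = bytes(range(256)) * 16
        send_frame(tx, producer_shm, 7, payload)
        seq, data = recv_frame(rx, consumer_shm)
        return seq, data == payload, len(data)
    finally:
        tx.close()
        rx.close()
        consumer_shm.close()
        producer_shm.close()
        producer_shm.unlink()


if __name__ == "__main__":
    print(demo())
```

The socket only ever carries the 12-byte header, so its cost is independent of the sample size. A real receiver would probably wrap `shm.buf` in a NumPy array instead of copying it with `bytes(...)`, and would need some acknowledgement before the producer overwrites the buffer.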
200 MBytes/s total: 10 MByte sample at 20 samples per second
Case | Global CPU consumption | C++ part | python part |
---|---|---|---|
A | 17.5 % | 10% | 7.5% |
B | 6% | 1% | 5% |
200 MBytes/s total: 0.2 MByte sample at 1000 samples per second
Case | Global CPU consumption | C++ part | python part |
---|---|---|---|
A | 13.5 % | 6.7% | 6.8% |
B | 11% | 5.5% | 5.5% |
In my application, using mmap has a huge impact for large samples at a moderate rate: global CPU consumption drops from 17.5% to 6%, almost a threefold reduction. With small samples at very high rates, the benefit of shared memory is still there but less impressive (from 13.5% to 11%, roughly a 20% improvement). Maximum throughput is more than twice as high.
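For anyone who wants to check the figures, the ratios can be recomputed from the two tables. The helper `reduction` is hypothetical; the inputs are the global CPU percentages reported above.

```python
def reduction(before, after):
    """Return (ratio, percent saved) when CPU use drops from before to after."""
    return before / after, 100.0 * (before - after) / before


# Data rate of each scenario in MByte/s: sample size times sample rate.
rate_big = 10 * 20        # 10 MByte samples at 20 samples per second
rate_small = 0.2 * 1000   # 0.2 MByte samples at 1000 samples per second

big = reduction(17.5, 6.0)     # Case A vs Case B, large samples
small = reduction(13.5, 11.0)  # Case A vs Case B, small samples

if __name__ == "__main__":
    print(f"large samples: {big[0]:.2f}x less CPU, {big[1]:.1f}% saved")
    print(f"small samples: {small[0]:.2f}x less CPU, {small[1]:.1f}% saved")
```

Both scenarios move the same 200 MByte/s, so the difference in benefit comes from the per-sample overhead, which shared memory does not remove.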
Using mmap is a good upgrade for me. I just wanted to share my results here.
Upvotes: 3