Albert

Reputation: 68280

Fastest way to communicate with subprocess

I have a parent process which spawns several subprocesses to do some CPU-intensive work. For each batch of work, the parent needs to send several hundred megabytes of data (as one single chunk) to the subprocess, and when it is done, it must receive about the same amount of data back (again as one single chunk). The parent process and the subprocesses are different applications written in different languages (mostly Python and C++), but if there is a solution in C/C++, I could write a Python wrapper if needed.

I thought the simplest way would be to use pipes. That has many advantages: it is mostly cross-platform, simple, and flexible, and I could probably later extend my code without too much work to communicate over a network.
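To make it concrete, here is a minimal sketch of what I mean by the pipe-based round trip, seen from the Python side (the ./worker name and the one-chunk-in / one-chunk-out stdin/stdout protocol are just placeholders, not my actual code):

    import subprocess

    def run_batch(chunk: bytes) -> bytes:
        # "./worker" is a placeholder for the C++ subprocess binary.
        proc = subprocess.Popen(
            ["./worker"],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
        )
        # communicate() writes the whole chunk to the worker's stdin, closes it,
        # and collects stdout until the worker exits; every byte crosses the
        # kernel twice (user -> kernel on write, kernel -> user on read).
        out, _ = proc.communicate(chunk)
        return out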

However, now that I'm profiling the whole application, I see some noticeable overhead in the communication and I wonder whether there are faster ways. Cross-platform support is not really needed for my case (scientific research); it's enough if it works on Ubuntu >= 12 or so (although MacOSX would also be nice). In principle, I thought that copying a big chunk of data into a pipe and reading it at the other end should not take much more time than setting up some shared memory and doing a memcpy. Am I wrong? Or how much worse would you expect the performance to be?
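For comparison, here is a rough sketch of the shared-memory variant I have in mind on Linux (the segment name and the idea of signalling the worker over the existing pipe are assumptions, not something I have implemented):

    import mmap
    import os

    def publish_chunk(name: str, chunk: bytes) -> None:
        # Create a file under /dev/shm so the C++ worker can shm_open()/mmap()
        # the segment with the same name and read the data in place.
        fd = os.open(f"/dev/shm/{name}", os.O_CREAT | os.O_RDWR, 0o600)
        try:
            os.ftruncate(fd, len(chunk))          # size the segment
            with mmap.mmap(fd, len(chunk)) as m:
                m[:] = chunk                      # a single memcpy-style copy
        finally:
            os.close(fd)
        # The worker still has to be told the segment name and size (e.g. over
        # the existing pipe), and both sides need some synchronisation.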

The profiling itself is complicated and I don't really have reliable and exact data, only clues (because it's all quite a complicated system). I wonder where I should spend my time now. Trying to get more exact profiling data? Trying to implement a shared-memory solution and seeing how much it improves things? Or something else? I also thought about compiling the subprocess application into a library, wrapping it, and linking it into the main process, thus avoiding communication with another process altogether; in that case all I would need is a memcpy.
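For that linked-in variant, something like the following ctypes sketch is what I picture (libworker.so, the process_batch entry point, and the equal-sized output buffer are all assumed; nothing like this exists yet):

    import ctypes

    # Assumed C entry point:
    #   void process_batch(const char *in, size_t in_len, char *out, size_t out_len);
    lib = ctypes.CDLL("./libworker.so")
    lib.process_batch.argtypes = [
        ctypes.c_char_p, ctypes.c_size_t,
        ctypes.c_char_p, ctypes.c_size_t,
    ]
    lib.process_batch.restype = None

    def run_batch_in_process(chunk: bytes) -> bytes:
        out = ctypes.create_string_buffer(len(chunk))   # assume same-size result
        lib.process_batch(chunk, len(chunk), out, len(out))
        return out.raw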

There are quite a few related questions here on StackOverflow, but I haven't really seen a performance comparison of the different communication methods.

Upvotes: 2

Views: 370

Answers (1)

Albert

Reputation: 68280

OK, so I wrote a small benchmarking tool here which copies some data (~200MB) ten times, either via shared memory or via a pipe.

Results on my MacBook with MacOSX:

Shared memory:
   24.34 real        18.49 user         5.96 sys
Pipe: 
   36.16 real        20.45 user        17.79 sys

So, first we see that shared memory is noticeably faster. Note that if I copy smaller chunks of data (~10MB), I see almost no difference in total time.

The second noticeable difference is the time spent in the kernel. It is expected that the pipe needs more kernel time, because the kernel has to handle all those reads and writes, but I would not have expected it to be that much.
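To illustrate where that system time goes (this is not my benchmarking tool, just a rough Python sketch with arbitrary sizes): pushing a buffer through a pipe means many kernel-mediated copies through the limited pipe buffer, while the shared-memory path is a single user-space copy into the mapped region.

    import mmap
    import os
    import threading
    import time

    SIZE = 200 * 1024 * 1024          # roughly the 200MB used above
    data = bytes(SIZE)

    def time_pipe() -> float:
        r, w = os.pipe()

        def drain() -> None:
            with os.fdopen(r, "rb") as reader:
                while reader.read(1 << 20):       # consume 1 MiB at a time
                    pass

        t = threading.Thread(target=drain)
        t.start()
        start = time.perf_counter()
        with os.fdopen(w, "wb") as writer:
            writer.write(data)                    # many copies through the pipe buffer
        t.join()
        return time.perf_counter() - start

    def time_shared_memory() -> float:
        with mmap.mmap(-1, SIZE) as m:            # anonymous shared mapping
            start = time.perf_counter()
            m[:] = data                           # one user-space copy
            return time.perf_counter() - start

    print("pipe:", time_pipe(), "shm:", time_shared_memory())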

Upvotes: 2
