terdon

Reputation: 3380

Is it faster to run one process that spawns N threads or to run N processes?

I realize that this will largely depend on the processes in question, but is there a rule of thumb?

Say I have a multi-threaded program called progX that offers a command line switch (--cpu) controlling the number of CPUs it can use. Is it faster to launch 40 parallel instances using one CPU each (progX --cpu 1) or to launch a single instance, telling it to use 40 CPUs (progX --cpu 40)?

Upvotes: 6

Views: 1597

Answers (4)

Pragmateek

Reputation: 13374

The general rule is: if your tasks are independent of each other, the performance of the multi-process version will tend toward that of the multi-threaded version (some operating systems implement threads as processes, in which case performance will be essentially equivalent).

You'll pay more to initialize each process, especially if you're using a hosted runtime (e.g. Java or .NET), but over time this startup cost becomes negligible.

So if you have small tasks the difference could be huge, but if your tasks run for hours it will be negligible.

Things become interesting when there is some interaction between your tasks:

  • shared data: sharing memory between processes is more involved and more costly than sharing it between threads

  • synchronization: synchronizing processes is also more cumbersome, especially if your language offers constructs for transparent thread synchronization
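
To illustrate the shared-data point, here is a minimal sketch (the function name `demo_shared` and the variable `counter` are made up for this example). A child process works on its own copy of the parent's data, so a result has to be handed back explicitly, here through a pipe:

```shell
#!/bin/bash
# demo_shared is a hypothetical name used only for this illustration.
demo_shared() {
    counter=0

    # The background subshell is a separate process with its own copy of `counter`.
    ( counter=$((counter + 1)) ) &
    wait
    echo "after child update: $counter"        # the parent's copy is unchanged

    # To get a result back, the processes must communicate explicitly,
    # here through the pipe behind a command substitution.
    counter=$( echo $((counter + 1)) )
    echo "after explicit hand-off: $counter"
}

demo_shared
```

Threads in the same process would see a single `counter`, which is cheaper but is also exactly what makes synchronization necessary.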

Performance is not the strong point of the multi-process approach, but there are other good reasons to use it, such as reliability: if you use some component that may break and crash the process, a failure in a multi-threaded application will cause the loss of all your tasks, whereas in a multi-process application only one task will fail.
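
The reliability argument can be sketched like this, assuming three independent tasks of which one crashes (`worker` and `run_workers` are hypothetical names, and the crash is simulated with `exit 1`):

```shell
#!/bin/bash
# worker ID: hypothetical task. Worker 2 simulates a crash.
worker() {
    if [ "$1" -eq 2 ]; then
        exit 1                  # ends only this worker's own process
    fi
    echo "task $1 done"
}

run_workers() {
    pids=""
    for id in 1 2 3; do
        worker "$id" &          # each worker runs in its own process
        pids="$pids $!"
    done

    failed=0
    for pid in $pids; do
        # wait PID returns that worker's exit status
        wait "$pid" || failed=$((failed + 1))
    done
    echo "failed workers: $failed"
}

run_workers
```

Tasks 1 and 3 still complete and only one failure is reported. Had the three tasks been threads of a single process, a crash in one of them would have taken the other two down with it.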

Upvotes: 2

tucuxi

Reputation: 17945

The most precise answer is to take a smallish task and time your application with a variety of settings. You can launch N processes using M CPUs each with:

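#!/bin/bash
# $1 = CPUs per instance (M), $2 = number of instances (N)
M=$1
N=$2
for ((i = 0; i < N; i++)); do
    ( echo "$i" && time progX --cpu "$M" ) &   # run and time each instance in the background
done
wait   # do not exit until every instance has finished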

The time that matters is the last one printed, since all processes should start in parallel at more or less the same time.
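
If you only care about the total, a variation on the same idea is to measure the wall-clock time of the whole batch instead of reading the individual timings. This is a minimal sketch: `run_batch` is a name I made up, and it assumes `date +%s` is available, so the resolution is whole seconds.

```shell
#!/bin/bash
# run_batch N CMD [ARGS...]
# Start N copies of CMD in parallel, wait for all of them, and print the
# wall-clock seconds from the first launch until the last copy exits.
# The body is a ( subshell ) so that `wait` only sees the jobs started here.
run_batch() (
    n=$1
    shift
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" > /dev/null 2>&1 &     # discard the output of each copy
        i=$((i + 1))
    done
    wait
    end=$(date +%s)
    echo $((end - start))
)
```

You would then compare `run_batch 40 progX --cpu 1` against `run_batch 1 progX --cpu 40`, ideally several times each, and on a task long enough that one-second resolution does not matter.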

Upvotes: 1

canolucas

Reputation: 1490

It is faster to launch a single instance, by far. Threads are made for that purpose, and they are lighter than processes. The de facto rule is: let the OS do the scheduling and memory management unless you need to do the dirty work yourself. This way your code will be much simpler and cleaner. The OS has a bunch of lower-level tools to handle processes and memory much more efficiently. Of course it will depend on the OS, but this is a general rule for modern operating systems, and at least for the one I use (Linux).

Upvotes: 2

Tomasz Nurkiewicz

Reputation: 340903

It largely depends on the OS, but in general threads are more lightweight than processes (in fact, each process is composed of at least one thread), so by starting one process with 40 threads you'll put less pressure on the system, especially with regard to memory consumption.

Also remember that threads are fundamentally different from processes in that they operate on a shared address space. But that is irrelevant if they don't communicate with each other.

Upvotes: 2
