Reputation: 80276
I have a bunch of commands I would like to execute in parallel. The commands are nearly identical. They can be expected to take about the same time, and can run completely independently. They may look like:
command -n 1 > log.1
command -n 2 > log.2
command -n 3 > log.3
...
command -n 4096 > log.4096
I could launch all of them in parallel in a shell script, but the system would try to load more than strictly necessary to keep the CPU(s) busy (each task takes 100% of one core until it has finished). This would cause the disk to thrash and make the whole thing slower than a less greedy approach to execution.
The best approach is probably to keep about n tasks executing, where n is the number of available cores.
I am keen not to reinvent the wheel. This problem has already been solved in the Unix make program (when used with the -j n option). I was wondering whether it is possible to write generic Makefile rules for the above, so as to avoid the linear-size Makefile that would look like:
all: log.1 log.2 ...
log.1:
	command -n 1 > log.1
log.2:
	command -n 2 > log.2
...
If the best solution is not to use make but another program/utility, I am open to that as long as the dependencies are reasonable (make was very good in this regard).
Upvotes: 4
Views: 3798
Reputation: 33685
With GNU Parallel you would write:
parallel command -n {} ">" log.{} ::: {1..4096}
10 second installation:
(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash
Learn more: http://www.gnu.org/software/parallel/parallel_tutorial.html https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Upvotes: 2
Reputation: 19756
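Here is a more portable way to generate the numbers, one that does not depend on brace expansion:
LOGS := $(shell seq 1 1024)
Note the use of := to define a simply expanded variable, the more efficient "flavor": the shell command runs once, when the variable is defined, rather than every time the variable is referenced.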
Upvotes: 4
Reputation: 31718
xargs -P is the "standard" way to do this. Note that, depending on disk I/O, you may want to limit the number of jobs to the number of spindles rather than cores. If you do want to limit to cores, note the nproc command in recent coreutils.
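A minimal sketch of the xargs -P approach, under a few assumptions: echo stands in for the real command, the run is cut down to 16 tasks, the work happens in a throwaway directory, and 4 is an arbitrary fallback job count for systems without nproc.

```shell
#!/bin/sh
# Work in a scratch directory so the log files do not clutter the current one.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# nproc comes with GNU coreutils; the fallback of 4 is an arbitrary assumption.
jobs=$(nproc 2>/dev/null || echo 4)

# -n 1 passes one number per invocation, -P keeps up to $jobs tasks running.
# `echo "task $1"` is a stand-in: replace it with the real `command -n "$1"`.
seq 1 16 | xargs -n 1 -P "$jobs" sh -c 'echo "task $1" > "log.$1"' _

# Show how many log files were produced.
ls log.* | wc -l
```

The small sh -c wrapper is there because the redirection has to happen per task; xargs itself only passes arguments and does not interpret ">".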
Upvotes: 3
Reputation: 99094
First the easy part. As Roman Cheplyaka points out, pattern rules are very useful:
LOGS = log.1 log.2 ... log.4096
all: $(LOGS)
log.%:
	command -n $* > log.$*
The tricky part is creating that list, LOGS. Make isn't very good at handling numbers. The best way is probably to call on the shell. (You may have to adjust this script for your shell; shell scripting isn't my strongest subject.)
NUM_LOGS = 4096
LOGS = $(shell for ((i=1 ; i<=$(NUM_LOGS) ; ++i)) ; do echo log.$$i ; done)
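Putting the pieces together, here is a sketch that writes a small Makefile and runs it with make -j. It assumes GNU make and a seq utility are available. It swaps the for ((...)) loop for seq plus addprefix, because make runs recipes and $(shell ...) through /bin/sh, which may not accept that bash-style loop. The echo stand-in, the count of 16, and the fallback job count of 4 are assumptions for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so the generated files are easy to discard.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The quoted 'EOF' keeps the shell from expanding $(...) and $* in the Makefile.
cat > Makefile <<'EOF'
NUM_LOGS = 16
# Build the list log.1 ... log.N once, at definition time (simply expanded).
LOGS := $(addprefix log.,$(shell seq 1 $(NUM_LOGS)))

.PHONY: all
all: $(LOGS)

# $* is the pattern stem (the number) and $@ is the target name.
# `echo "task $*"` is a stand-in for the real `command -n $*`.
# The "; recipe" form on the rule line avoids a tab-indented recipe here.
log.%: ; echo "task $*" > $@
EOF

# nproc comes with GNU coreutils; the fallback of 4 is an arbitrary assumption.
jobs=$(nproc 2>/dev/null || echo 4)
make -s -j "$jobs"
```

One advantage of going through make is that a rerun only executes the tasks whose log file is missing, so an interrupted batch can be resumed.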
Upvotes: 3
Reputation: 38718
See pattern rules
Another way, if this is the single reason why you need make, is to use the -n and -P options of xargs.
Upvotes: 3