Reputation: 525
I've recently discovered GNU parallel, and it is already incredibly useful, but I cannot figure out how to get all my output into any kind of usable structure. Here are my issues:
So: is there any way within parallel (i.e. without having to modify my command) to either capture stderr when using --files, or force --results to use sensible directory names?
EDIT: In response to comment, I've tried:
find controlFiles/ -name "*.txt" | parallel --files --tmpdir logs --tagstr {/.} -j15 --joblog logs/joblog --eta /path/to/command --opt --opt2 /path/to/data /path/to/output {} > logs/logfiles.txt
and
find controlFiles/ -name "*.txt" | parallel --files --results logs --tagstr {/.} -j15 --joblog logs/joblog --eta /path/to/command --opt --opt2 /path/to/data /path/to/output {} > logs/logfiles.txt
where the former loses stderr and the latter produces unusable directory names.
EDIT2: After a bunch more testing, it seems I somehow got things into a really weird state. The directory structure from --results is supposed to be named after the arguments, but somehow mine was using the entire command. When I tried removing the existing logs directory and starting fresh with what I thought was the same command, I got the expected behavior. Still not ideal, but I can certainly live with it.
Upvotes: 2
Views: 2061
Reputation: 33740
The most obvious solution is to strip the long common part from the directory names after the jobs are done (this uses the Perl version of rename):
cd resultdir/1/
rename 's:long/common/string/to/remove::' */2/*
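If the Perl rename is not installed, the same clean-up can be approximated with a plain shell loop. This is only a sketch: strip_prefix is a hypothetical helper (not part of GNU parallel), and it assumes the part to remove is a literal prefix of directory names sitting directly under one directory.

```shell
# strip_prefix ROOT PREFIX
# Hypothetical helper: for every directory directly under ROOT whose name
# starts with PREFIX, rename it so that the prefix is removed.
strip_prefix() {
  local root=$1 prefix=$2 dir name new
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue          # glob matched nothing
    dir=${dir%/}
    name=${dir##*/}
    case $name in
      "$prefix"*) new=${name#"$prefix"} ;;
      *) continue ;;                   # name does not start with the prefix
    esac
    # Skip if nothing is left after stripping or the target already exists.
    [ -n "$new" ] && [ ! -e "$root/$new" ] || continue
    mv -- "$dir" "$root/$new"
  done
}
```

It would be called as, for example, strip_prefix resultdir/1 some_common_prefix_ where both arguments are placeholders for your own paths.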
Another idea is to use the new .csv output (available from 20161222):
parallel --results foo.csv ...
which will generate a CSV file with the content from --joblog, the arguments, stdout, and stderr. This is particularly handy if you want to post-process the results in R or LibreOffice Calc.
If you prefer mixed stderr/stdout, simply let 2>&1 be part of your command:
parallel '(echo joe; ls /doesnotexists {}) 2>&1' ::: bar > foo
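To see what the redirection changes, the same command group can be run directly in the shell, without parallel. A minimal sketch, assuming /doesnotexists really does not exist on your system; the variable names are only for illustration:

```shell
# Only stdout is captured; the error message from ls is discarded.
only_stdout=$( (echo joe; ls /doesnotexists) 2>/dev/null )

# The parentheses group both commands in a subshell, so 2>&1 applies to
# the whole group and the error message ends up in the captured output.
merged=$( (echo joe; ls /doesnotexists) 2>&1 )

printf 'stdout only:\n%s\n\n' "$only_stdout"
printf 'merged:\n%s\n' "$merged"
```

Because parallel's --files only saves stdout, merging the streams inside the command this way is what lets the stderr text land in the saved files as well.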
From version 20170122 you can use a replacement string in the --results path to choose the output names yourself:
parallel --results out/{/.} mycommand
Upvotes: 4