Istvan

Reputation: 8572

Processing file with xargs for concurrency

There is an input like:

folder1
folder2
folder3
...
foldern

I would like to iterate over the input, taking multiple lines at once, and process each line: remove the first / (and more later, but this is enough for now) and echo the result. Iterating over it in bash with a single thread can be slow. The alternative would be to split the input file into N pieces, run the same script N times with different inputs and outputs, and merge the results at the end.

I was wondering if this is possible with xargs.

Update 1:

Input:

/a/b/c
/d/f/e
/h/i/j

Output:

mkdir a/b/c
mkdir d/f/e
mkdir h/i/j

Script:

for i in $(<test); do
  echo mkdir "$(echo "$i" | sed 's/\///')"
done

Doing it with xargs does not work as I would expect:

xargs -a test -I line --max-procs=2 echo mkdir $(echo $line | sed 's/\///')

Obviously I need a way to execute the sed on the input for each line, but using $() does not work.
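The reason is that $() is expanded once, by the invoking shell, before xargs even starts, so it never sees the individual lines. One possible workaround (a sketch, assuming GNU xargs and the input file test from above) is to spawn a small shell per line so the substitution happens there:

```shell
# Sample input as in the question.
printf '/a/b/c\n/d/f/e\n/h/i/j\n' > test

# Spawn one short-lived shell per line; "$1" is the line xargs passes in,
# and -P 2 keeps up to two of these shells running concurrently.
xargs -a test -n 1 -P 2 sh -c 'echo mkdir "$(printf "%s\n" "$1" | sed "s/\///")"' _
```

Note that with -P 2 the output lines may appear in any order.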

Upvotes: 2

Views: 2121

Answers (2)

Ole Tange

Reputation: 33725

With GNU Parallel you can do:

cat file | perl -pe 's:/::' | parallel mkdir -p

or:

cat file | parallel mkdir -p {= s:/:: =}

Upvotes: 0

Thomas Moulard

Reputation: 5542

You probably want:

--max-procs=max-procs, -P max-procs
          Run up to max-procs processes at a time; the default is  1.   If
          max-procs  is 0, xargs will run as many processes as possible at
          a time.  Use the -n option with -P; otherwise chances  are  that
          only one exec will be done.

http://unixhelp.ed.ac.uk/CGI/man-cgi?xargs
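Putting -n and -P together for the question's input (a sketch, assuming the input file is named test as in the question): strip the leading slash up front with sed, so no per-line command substitution is needed inside xargs at all.

```shell
# Sample input as in the question.
printf '/a/b/c\n/d/f/e\n/h/i/j\n' > test

# Remove the leading slash first, then let xargs fan out:
# -n 1 passes one argument per invocation, -P 4 runs up to four at once.
sed 's,^/,,' test | xargs -n 1 -P 4 echo mkdir
```

With -P greater than 1 the output order is not guaranteed, which is usually fine for independent mkdir calls.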

Upvotes: 3
