ATpoint

Reputation: 878

Basename in GNU Parallel

I have hundreds of files, named as follows:

RG1-t.txt

RG1-n.txt

RG2-t.txt

RG2-n.txt

etc...

I would like to use GNU parallel to run scripts on them, but I am struggling to get the basenames of the files (RG1, RG2, etc.) so that I can run something like:

ls RG*.txt | parallel "command.sh {basename}-t.txt {basename}-n.txt > {basename}.out"

resulting in the files RG1.out, RG2.out etc. Any ideas?

Upvotes: 12

Views: 6007

Answers (3)

Ole Tange

Reputation: 33740

Use --rpl:

printf '%s\0' RG*-n.txt |
  parallel -0 --rpl '{basename} s/-..txt$//' "command.sh {basename}-t.txt {basename}-n.txt > {basename}.out"

Or dynamic replacement strings with --plus:

printf '%s\0' RG*-n.txt |
  parallel -0 --plus "command.sh {%-n.txt}-t.txt {} > {%-n.txt}.out"

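printf is a shell builtin, so unlike ls it is not subject to the kernel's argument-length limit. With hundreds of thousands of files this avoids: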

bash: /bin/ls: Argument list too long
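To check which names the second command will generate before running anything, a sequential sketch like the one below may help. It assumes a scratch directory with four empty sample files (names made up to match the question) and uses the shell's ${f%-n.txt} expansion, which strips the same suffix that {%-n.txt} strips under --plus. command.sh is only echoed here, never executed.

```shell
# Scratch directory with empty sample files (hypothetical names from the question).
tmp=$(mktemp -d) && cd "$tmp" || exit 1
touch RG1-t.txt RG1-n.txt RG2-t.txt RG2-n.txt

# Sequential preview: ${f%-n.txt} removes the trailing "-n.txt",
# the same suffix that {%-n.txt} removes under parallel --plus.
preview=$(for f in RG*-n.txt; do
  base=${f%-n.txt}
  echo "command.sh ${base}-t.txt $f > ${base}.out"
done)
printf '%s\n' "$preview"
```

With GNU parallel installed, adding --dry-run to the parallel command should print the equivalent lines without executing them.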

Upvotes: 4

jaygooby

Reputation: 2536

Use the built-in stripping options:

  1. Dirname ({//}) and basename ({/}) and remove custom suffix ({%suffix}, needs --plus)

    $ echo dir/file_1.txt.gz | parallel --plus echo {//} {/} {%_1.txt.gz}

  2. Get basename, and remove last ({.}) or any ({:}) extension

    $ echo dir.d/file.txt.gz | parallel 'echo {.} {:} {/.} {/:}'
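For readers more used to plain shell, the sketch below shows rough shell counterparts of some of these replacement strings, using the same sample path. The mapping is an approximation that holds for ordinary filenames like this one; it is not a description of how parallel implements them.

```shell
f=dir.d/file.txt.gz

dir=$(dirname "$f")        # like {//}: directory part
base=$(basename "$f")      # like {/}:  filename without the directory
noext=${f%.*}              # like {.}:  last extension removed
base_noext=${base%.*}      # like {/.}: basename with last extension removed

printf '%s\n' "$dir" "$base" "$noext" "$base_noext"
```

This prints dir.d, file.txt.gz, dir.d/file.txt and file.txt, one per line.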

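Note that {.} strips only the .txt extension, so RG1-t.txt becomes RG1-t rather than RG1. For your files, list only the -t files and strip the whole suffix with --plus:

ls RG*-t.txt | parallel --plus "command.sh {} {%-t.txt}-n.txt > {%-t.txt}.out"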

Upvotes: 23

Mark Setchell

Reputation: 207798

Try feeding parallel like this:

ls RG*t.txt | cut -d'-' -f1 | parallel 'command.sh {}-t.txt {}-n.txt > {}.out'

Or, if you prefer awk:

ls RG*t.txt | awk -F'-' '{print $1}' | parallel ...

Or, if you prefer sed:

ls RG*t.txt | sed 's/-.*//' | parallel ...

Or, if you prefer GNU grep:

ls RG* | grep -Po '.*(?=-t.txt)' | parallel ...
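If you want to see what parallel will actually receive on stdin, you can run the first half of each pipeline on its own. The sketch below does that in a scratch directory with made-up sample files matching the question. All three variants cut at the first hyphen, so they assume the prefix itself contains no "-".

```shell
# Scratch directory with empty sample files (hypothetical names from the question).
tmp=$(mktemp -d) && cd "$tmp" || exit 1
touch RG1-t.txt RG1-n.txt RG2-t.txt RG2-n.txt

# Each pipeline should yield one prefix per line: the input parallel would read.
with_cut=$(ls RG*t.txt | cut -d'-' -f1)
with_awk=$(ls RG*t.txt | awk -F'-' '{print $1}')
with_sed=$(ls RG*t.txt | sed 's/-.*//')

printf '%s\n' "$with_cut"
```

Here all three variables should hold the two lines RG1 and RG2.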

Upvotes: 2
