Reputation: 2684
I'm working with files organized in a folder (named RAW) that contains several other folders with different names, each of them containing files whose names end in a string like _1 or _2 followed by the extension (.fq.gz in this case). The layout below is included for guidance.
RAW/
    FOLDER1/
        FILE_qwer_1.fa.gz
        FILE_qwer_2.fa.gz
    FOLDER2/
        FILE_tyui_1.fa.gz
        FILE_tyui_2.fa.gz
    OTHER1/
        FILE_asdf_1.fa.gz
        FILE_asdf_2.fa.gz
    ...
So I am basically looping over all those directories under RAW and running a script that creates an output file, say out.
What I'm trying to accomplish is to name that out file after the folder it belongs to under $RAW (e.g. FOLDER1.eg after processing FILE_qwer_1.fa.gz and FILE_qwer_2.fa.gz above).
The loop below actually works, but as you can imagine, it depends on how many folders deep I am working below the root /, because the field number given to the -f option of the cut command is hard-coded.
for file1 in ${RAW}/*/*_1.fq.gz; do
    file2="${file1/_1/_2}"
    out="$(echo $file1 | cut -d '/' -f2)"
    bash script_to_be_run.sh $file1 $file2 $out
done
Ideally, the variable out should be set to whatever the first * of the glob used in the loop matched, followed by a custom extension (e.g. FOLDER1.eg in the first iteration), but I do not really know how to do it, or whether it is possible.
Upvotes: 1
Views: 97
Reputation: 2304
You can use ${var#prefix} to remove a prefix from the start of a variable.
for file1 in "$RAW"/*/*_1.fq.gz; do
    file2="${file1%_1.fq.gz}_2.fq.gz"  # swap only the trailing _1 for _2, not an earlier _1 in the path
    out="$(dirname "${file1#"$RAW"/}")"  # strips the leading $RAW/, leaving only the folder name
    bash script_to_be_run.sh "$file1" "$file2" "$out"
done
(It's a good idea to quote variable expansions in case they contain spaces or other special characters: "$file1" is safer than $file1.)
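To see what the prefix removal produces without touching any real files, here is a minimal sketch run on a single sample path. The value of RAW and the sample file name are made up for illustration, and the .eg extension is the one mentioned in the question. It also shows an alternative that takes the name of the file's parent directory, which does not need RAW at all.

```shell
#!/usr/bin/env bash
# Hypothetical sample values, for illustration only.
RAW="/data/project/RAW"
file1="$RAW/FOLDER1/FILE_qwer_1.fq.gz"

# Option 1: strip the leading "$RAW/" and keep the directory part.
rel="${file1#"$RAW"/}"          # path relative to $RAW
out="$(dirname "$rel").eg"      # folder name plus the custom extension

# Option 2: name of the parent directory, independent of $RAW.
parent="$(basename "$(dirname "$file1")")"
out2="${parent}.eg"

# Mate file: replace only the trailing _1.fq.gz with _2.fq.gz.
file2="${file1%_1.fq.gz}_2.fq.gz"

printf '%s\n' "$out" "$out2" "$file2"
```

Both options give the same name as long as the files sit exactly one level below RAW. If they could sit deeper, option 1 would return a multi-component path while option 2 would still return only the immediate parent folder, so which one fits depends on the layout.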
Upvotes: 2