Reputation: 1
New guy here with a problem that hopefully has an easy solution, but I just can't seem to manage it.
So, I have a large list of files that I need to process using the same command line program, and I'm trying to write a small shell script to automate this. I wrote something that reads the input file names from a text file and repeats the command for each of those files. So far so good. My problem, though, is with naming the output. Each file is named in the general format "lane_number_bla_bla_bla", and they are processed in pairs. So, there will be a "lane_1_bla_bla_bla_001" and a "lane_1_bla_bla_bla_002" that need to be combined into a single output file. For this, I'm trying to use awk to read the lane and sample number from the .txt list of input files and put them into the output file name. Here's the code I came up with (note that the echo statement before the command is there just for testing and is removed when I run the actual program; also, this is not the actual command, which is rather more complicated, but the principle still applies):
echo "Which input1 should I use?"
read text
input1=$text
echo "Which input2 should I use?"
read text
input2=$text
echo "How many lines?"
read text
n=$text
for i in $(seq 1 $n)
do
awkinput1=$(awk NR==$i $input1)
awkinput2=$(awk NR==$i $input2)
num=$(awk 'NR==$i{print $2 }' FS="_" $input1)
lane=$(awk 'NR==$i{print $1 }' FS="_" $input1)
echo "command $awkinput1.in > $awkinput1.out && command $awkinput2.in > $awkinput2.out && command cat $awkinput1.out $awkinput2.in > $num-$lane-CAT.out &"
if (( $i % 10 == 0 )); then wait; fi # Limit to 10 concurrent subshells.
done
When I run this, both $awkinput fields get replaced properly in the command line by the appropriate filename, but not the $num and $lane fields, which print nothing.
So, what am I doing wrong? I'm sure it's pretty simple, but I tried quite a lot of different ways to format the relevant awk command, and nothing seems to work. I'm doing this on a remote Linux server over SSH, if it makes a difference.
Thanks a lot!
Upvotes: 0
Views: 477
Reputation: 1867
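$i is inside single quotes ('), so the shell does not expand it and awk sees the literal text $i instead of the line number. The quoted string should therefore be terminated before $i and reopened after it. FS should be set before parsing lines. The following code will work: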
num=$(awk 'BEGIN{FS="_"} NR=='$i'{print $2 }' $input1)
lane=$(awk 'BEGIN{FS="_"} NR=='$i'{print $1 }' $input1)
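As an alternative to closing and reopening the quotes around $i, the shell value can be handed to awk with its -v option, which keeps the whole awk program inside one pair of single quotes. The following is only a minimal sketch: the temporary list file, its two sample names and the awk variable name "line" are made up here for illustration and are not from the question.

```shell
# Hypothetical sample list, created only so the sketch runs on its own.
input1=$(mktemp)
printf '%s\n' 'lane_1_bla_bla_bla_001' 'lane_2_bla_bla_bla_001' > "$input1"

i=2   # line number to pick, as in the loop from the question

# -F_ sets the field separator to "_".
# -v line="$i" copies the shell value into the awk variable "line".
num=$(awk -F_ -v line="$i" 'NR == line { print $2 }' "$input1")
lane=$(awk -F_ -v line="$i" 'NR == line { print $1 }' "$input1")

echo "$num-$lane-CAT.out"

rm -f "$input1"
```

With this form nothing from the shell is pasted into the awk program text, so an empty or odd value in $i cannot change how the program is parsed.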
The code below will be more efficient, since it reads each list only once instead of rescanning it for every line:
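# Read one line from each list per iteration: $input1 on stdin, $input2 on fd 3.
while IFS= read -r in1 ; do
    IFS= read -r in2 <&3
    num=$(awk 'BEGIN{FS="_"} {print $2 }' <<<"$in1")
    lane=$(awk 'BEGIN{FS="_"} {print $1 }' <<<"$in1")
    ...
done <"$input1" 3<"$input2"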
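If the names always follow the lane_number_... pattern from the question, the two fields can also be cut out with the shell's own parameter expansion, so no awk process is started per line. This is a rough sketch under that assumption; the temporary files, their sample contents and the "names" variable are hypothetical and exist only to make the example self-contained.

```shell
# Hypothetical sample lists, created only so the sketch runs on its own.
input1=$(mktemp)
input2=$(mktemp)
printf '%s\n' 'lane_1_bla_bla_bla_001' 'lane_2_bla_bla_bla_001' > "$input1"
printf '%s\n' 'lane_1_bla_bla_bla_002' 'lane_2_bla_bla_bla_002' > "$input2"

names=""   # collects the generated output names, for illustration only

# Read the two lists in lockstep: $input1 on stdin, $input2 on fd 3.
while IFS= read -r in1 && IFS= read -r in2 <&3; do
    lane=${in1%%_*}    # text before the first "_"
    rest=${in1#*_}     # text after the first "_"
    num=${rest%%_*}    # second "_"-separated field
    names="$names$num-$lane-CAT.out "
    echo "$in1 + $in2 -> $num-$lane-CAT.out"
done < "$input1" 3< "$input2"

rm -f "$input1" "$input2"
```

Because the loop is fed by redirection rather than a pipe, it runs in the current shell and the variables set inside it are still available after done.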
Upvotes: 1