AishwaryaKulkarni

Reputation: 784

Storing a line in a variable

Hi, I have the following batch script, which submits each file as a separate job:

for file in ../Positive/*.txt_rn; do
bsub <<EOF
#BSUB -L /bin/bash
#BSUB -W 150:00
#BSUB -M 10000
#BSUB -n 3
#BSUB -e /somefolder/errors/%J.err
#BSUB -o /somefolder/errors/%J.out
  while read line; do
  name=`cat \$line | awk '{print $1":"$2"-"$3}'`
  four=`cat \$line | awk '{print $4}' | cut -d\: -f4`
  fasta=\$name".fa"
  op=\$name".rs"
  echo \$name | xargs samtools faidx /somefolder/rn4/Rattus_norvegicus/UCSC/rn4/Sequence/WholeGenomeFasta/genome.fa > \$fasta
  Process -F \$fasta -M "list_"\$four".txt" -p 0.003 | awk '(\$5 >= 0.67)' > \$op
 if [ -s "\$op" ]
   then
cat "\$line" >> ../Positive_Strand/$file".cons"
fi
rm \$lne
rm \$op
rm \$fasta
done < $file
EOF
done 

I am somehow unable to store the column values from each line (held in the $line variable) in the $name and $four variables, so I cannot carry on with the later steps. Any suggestions for improving the code would also be welcome.

Upvotes: 0

Views: 104

Answers (1)

Adam Katz

Reputation: 16118

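Your problem is that your back-ticks (`) are not escaped, and neither are the awk fields ($1, $2, ...), so the submitting shell expands them while it builds the here-document, before bsub ever sees the script. If you change EOF to 'EOF' on the bsub <<EOF line, the shell leaves the whole block uninterpreted and the backslash escapes are no longer needed.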
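The difference is easy to see in isolation. The sketch below is only an illustration, using a made-up variable and two hypothetical helper functions that are not part of the original script: the same body is emitted once with a bare delimiter and once with a quoted one.

```shell
name="outer"

# Bare delimiter: the current shell expands $name and runs the
# back-ticks while it builds the here-document.
show_unquoted() {
  cat <<EOF
value: $name / `echo substituted`
EOF
}

# Quoted delimiter: the body is passed through unchanged.
show_quoted() {
  cat <<'EOF'
value: $name / `echo substituted`
EOF
}

show_unquoted   # prints: value: outer / substituted
show_quoted     # prints: value: $name / `echo substituted`
```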

I've fixed your indentation and cleaned up some of your code. Note that the syntax highlighting here doesn't understand cat <<'EOF'. If you paste that into vim with highlighting enabled, you'll see that block is all the same color since it's just a string.

bsub_helper() {
  cat <<'EOF'
#BSUB -L /bin/bash
#BSUB -W 150:00
#BSUB -M 10000
#BSUB -n 3
#BSUB -e /somefolder/errors/%J.err
#BSUB -o /somefolder/errors/%J.out
  while read line; do
    name=`cat $line | awk '{print $1":"$2"-"$3}'`
    four=`cat $line | awk '{print $4}' | cut -d: -f4`
    fasta="$name.fa"
    op="$name.rs"
    genome="/somefolder/rn4/Rattus_norvegicus/UCSC/rn4/Sequence/WholeGenomeFasta/genome.fa"
    echo $name | xargs samtools faidx "$genome" > "$fasta"
    Process -F "$fasta" -M "list_$four.txt" -p 0.003 | awk '($5 >= 0.67)' > "$op"
    if [ -s "$op" ]
    then
      cat "$line" >> "../Positive_Strand/$file.cons"
    fi
    rm "$line" "$op" "$fasta"
EOF
  echo "  done < \"$1\""
}

for file in ../Positive/*.txt_rn; do
  bsub_helper "$file" |bsub
done 

I created a helper function because I needed to get the input in two commands. I am assuming that $file is the only variable in that block that you want interpreted. I also surrounded that variable (among others) with quotes so that the code can support file names with spaces in them. The final line of the helper has nested double quotes for this reason.
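One thing to watch with the quoted delimiter: the `$file` on the `cat "$line" >> ...` line is also passed through literally, and the job's shell has no `file` variable unless the script defines one. A minimal sketch of one way to hand it over follows. The function name `make_job` and the simplified loop body are hypothetical, and the path is assumed to contain no single quote.

```shell
# Hypothetical variant of bsub_helper; prints the job script on stdout.
make_job() {
  # Scheduler directives, copied literally.
  cat <<'EOF'
#BSUB -L /bin/bash
EOF
  # Expanded now, by the submitting shell, so the job gets its own $file.
  # Assumption: the path contains no single quote.
  printf "file='%s'\n" "$1"
  # Everything below is copied literally and runs inside the job.
  cat <<'EOF'
while read -r line; do
  # Stand-in for the real processing: append each line to the .cons file.
  printf '%s\n' "$line" >> "$file.cons"
done < "$file"
EOF
}
```

It would be used the same way as the helper above, for example `make_job "$file" | bsub`. Because the job now defines `file` itself, the loop can end with a plain `done < "$file"` inside the quoted block.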

I left your echo $name | xargs … line alone because it's so odd. xargs splits its input on whitespace, so each whitespace-separated entry in $name becomes its own argument to samtools (for samtools faidx, a region), whether or not $name is quoted in the echo.

If $name is a single region, try:

samtools faidx "$genome" "$name" > "$fasta"

If $name is multiple regions and none of them have spaces, try:

samtools faidx "$genome" $name > "$fasta"
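To see how many arguments each form produces, here is a small illustration. The value of `name` and the helper `show_args` are made up for the demo, and `echo` stands in for samtools.

```shell
name="chr1:100-200   chr2:300-400"   # hypothetical value holding two regions

# Print each argument received on its own line, in brackets.
show_args() { printf '[%s]\n' "$@"; }

# xargs splits its input on blanks, so quoting $name for echo changes nothing.
via_xargs=$(echo "$name" | xargs -n 1 echo)

# Without xargs, the quoting decides how many arguments the command gets.
unquoted=$(show_args $name)     # two arguments
quoted=$(show_args "$name")     # one argument, spaces included

printf '%s\n' "$via_xargs" "$unquoted" "$quoted"
```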

The only reason to use xargs here would be if you have too much content for one command line, but if you're running echo $name | xargs then you'll run into the same problem.
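If each line of the input file holds the columns themselves rather than the name of another file (an assumption, since the question does not show the input), then `cat $line` tries to open a file named after the line's contents. A minimal sketch of pulling the fields out of the variable instead, using a made-up sample line:

```shell
# Hypothetical sample line: chromosome, start, end, then a colon-separated fourth column.
line="chr1 100 200 a:b:c:TF7"

# Feed the text of the line to awk instead of treating it as a file name.
name=$(printf '%s\n' "$line" | awk '{print $1":"$2"-"$3}')
four=$(printf '%s\n' "$line" | awk '{print $4}' | cut -d: -f4)

printf '%s\n' "$name" "$four"
```

Inside the quoted here-document these lines can be used as written, with no backslash escapes.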

Upvotes: 1
