Kuhlambo

Reputation: 381

bash calculations with numbers from files

I am trying to do a simple thing:

I want to get the second number in the line with the second occurrence of the word TER, lower it by one, and process it further. The tr -s ' ' is there because the file is not delimited by tabs but by varying amounts of whitespace.

My script:

first_res_atombumb= grep 'TER' tata_sbox_cuda.pdb | head -n 2 | tail -1 |tr -s ' '| cut -f 2 -d ' '

echo $((first_res_atombumb-1))

but this only returns:

255

-1

Of course I want to have 254.

Adding | tr -d '\n' does not help either. What on earth is going on? I have already asked several people at work, and no one seems to know.

The lines in question look like this:

TER     128      DA3     4 

TER     255      DA3     8 

If I run grep 'TER' tata_sbox_cuda.pdb | head -n 2 | tail -1 | tr -s ' '| cut -f 2 -d ' ' directly on the command line, I get what I expect: just 255.

Upvotes: 0

Views: 49

Answers (2)

glenn jackman

Reputation: 246807

With bash, I'd write

n_ter=0
while read -a words; do 
    if [[ ${words[0]} == TER ]] && (( ++n_ter == 2 )); then 
        echo $(( ${words[1]} - 1 ))
    fi
done < file

but I'd use awk

awk '$1 == "TER" && ++n == 2 {print $2 - 1}' file

The problem with your code: you forgot to use the $() command substitution syntax

first_res_atombumb= grep 'TER' tata_sbox_cuda.pdb | head -n 2 | tail -1 |tr -s ' '| cut -f 2 -d ' '
# .................^...............................................................................^
echo $((first_res_atombumb-1))

You're setting the variable to an empty string in the environment of the grep command only. Then, since you're not capturing the output of that pipeline, "255" is printed to the terminal. Because the variable is still unset in your current shell, it evaluates to 0 inside the arithmetic expansion, so you effectively get echo $((0-1)), which prints -1.
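To see the difference between the two forms in isolation, here is a minimal sketch. The names demo_var and after_form1 are made up for illustration, and printf stands in for the grep pipeline from the question:

```shell
#!/usr/bin/env bash
# Illustrative only: demo_var and after_form1 are hypothetical names,
# and printf stands in for the grep pipeline from the question.
unset demo_var

# Form 1: "name= command". demo_var is set to the empty string only in the
# environment of printf; the output goes straight to the terminal.
demo_var= printf 'printed, not captured\n'
after_form1=${demo_var-unset}      # expands to "unset" if demo_var is unset
echo "after form 1: [$after_form1]"

# Form 2: command substitution. The output of the command is captured
# and assigned to demo_var in the current shell.
demo_var=$(printf '255\n')
echo "after form 2: [$demo_var]"
echo "$((demo_var - 1))"
```

With form 1 the variable never changes in the current shell, which is why the arithmetic in the question sees nothing to subtract from.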

All you need is:

first_res_atombumb=$(grep 'TER' tata_sbox_cuda.pdb | head -n 2 | tail -1 |tr -s ' '| cut -f 2 -d ' ')
# .................^^...............................................................................^

But I'd still use awk.

Upvotes: 2

Yaron

Reputation: 1242

If I understand your problem correctly you can solve it using AWK:

awk 'BEGIN{v=0} $1 == "TER" {v++;if (v==2) {print $2-1 ;exit}}' tata_sbox_cuda.pdb

Explanation:

  1. BEGIN{v=0} declares the counter v and sets it to 0 before any input is read (awk treats an uninitialized variable as 0 anyway, so this is optional).
  2. $1 == "TER" is the pattern: the action in {} runs for every line whose first field is TER, not only for the second one.
  3. {v++;if (v==2) {print $2-1 ;exit}} is the action: it increments v and, once v reaches 2 (the second occurrence of TER), prints the second field minus 1 and exits. Exiting early skips the remaining lines, which makes processing faster.
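Since the question wants to process the number further, here is a rough sketch of capturing the awk output in a shell variable. It runs against a small stand-in file rather than the real tata_sbox_cuda.pdb: the two TER lines are taken from the question, while the ATOM line and the variable names sample and first_res_atomnum are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Stand-in for tata_sbox_cuda.pdb. The two TER lines come from the question;
# the ATOM line is made up so that there is at least one non-TER line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
ATOM      1  P    DA3     1
TER     128      DA3     4
TER     255      DA3     8
EOF

# Count lines whose first field is TER; on the second one, print field 2
# minus 1 and stop reading. $(...) captures that output in the variable.
first_res_atomnum=$(awk '$1 == "TER" { v++; if (v == 2) { print $2 - 1; exit } }' "$sample")

echo "$first_res_atomnum"
rm -f "$sample"
```

For this sample input the script prints 254, and the value stays available in first_res_atomnum for later steps.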

Upvotes: 0
