Joel

Reputation: 67

how to extract columns from a text file with bash

I have a text file like this.

 res          ABS   sum     
 SER A   1   161.15 138.3  
 CYS A   2    66.65  49.6  
 PRO A   3    21.48  15.8  
 ALA A   4    77.68  72.0  
 ILE A   5    15.70   9.0  
 HIS A   6    10.88   5.9 

I would like to extract the names in the first column (res) based on the values in the last column (sum). I need to print the resnames where sum > 25 and those where sum < 25. How can I get output like this?

Upvotes: 1

Views: 13557

Answers (5)

jpmuc

Reputation: 1154

What about the good old cut? :)

Say you would like to have the second column:

sed -E 's/^ +//; s/ +/ /g' your_file.txt | cut -d" " -f 2

What is sed doing in this command? cut expects fields to be separated by a single delimiter character (see the documentation), so sed first strips the leading spaces and collapses each run of spaces into one.
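
As a rough sketch of the same idea, tr -s can do the squeezing instead of a regular expression. The helper name first_column and the file name sample.txt are made up for illustration; the data is the table from the question.

```shell
# first_column is a hypothetical helper: print column 1 of a
# space-aligned table, skipping the header line.
first_column() {
  tail -n +2 "$1" |   # drop the header
    sed 's/^ *//' |   # remove leading spaces so field 1 is not empty
    tr -s ' ' |       # squeeze each run of spaces into a single space
    cut -d ' ' -f 1   # cut can now split on one space
}

# Sample data from the question, written to a hypothetical file name.
cat > sample.txt <<'EOF'
 res          ABS   sum
 SER A   1   161.15 138.3
 CYS A   2    66.65  49.6
 PRO A   3    21.48  15.8
 ALA A   4    77.68  72.0
 ILE A   5    15.70   9.0
 HIS A   6    10.88   5.9
EOF

first_column sample.txt
```

This prints the six residue names, one per line. Changing -f 1 to another number selects a different column.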

Upvotes: 0

user unknown

Reputation: 36229

while read line
do 
v=($line)
sum=${v[4]}
((${sum/.*/} >= 25)) && echo ${v[0]}
done < file

You need to skip the first line (the header).

Since bash doesn't handle floating-point values, the loop compares only the integer part of sum, so it also prints rows where sum is exactly 25, which isn't strictly greater than 25.

This can be handled by calling bc for the arithmetic.

tail -n +2 ser.dat | while read -r line
do
  v=($line)
  sum=${v[4]}
  # bc prints 1 when the comparison is true and 0 otherwise
  (( $(echo "$sum > 25" | bc) )) && echo "${v[0]}"
done
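
If bc is not available, one possible workaround is to compare integers in tenths. This is only a sketch: it assumes every sum value has exactly one digit after the decimal point, as in the sample data, and names_above_25 and sample.txt are hypothetical names.

```shell
# names_above_25 is a hypothetical helper. It assumes every sum value
# has exactly one digit after the decimal point, as in the sample data.
names_above_25() {
  {
    read -r header                      # consume the header line
    while read -r res chain num abs sum
    do
      [ -n "$sum" ] || continue         # skip blank or short lines
      tenths="${sum%.*}${sum#*.}"       # "138.3" -> "1383" (value in tenths)
      if [ "$tenths" -gt 250 ]; then    # 250 tenths is 25.0
        echo "$res"
      fi
    done
  } < "$1"
}

# Sample data from the question, written to a hypothetical file name.
cat > sample.txt <<'EOF'
 res          ABS   sum
 SER A   1   161.15 138.3
 CYS A   2    66.65  49.6
 PRO A   3    21.48  15.8
 ALA A   4    77.68  72.0
 ILE A   5    15.70   9.0
 HIS A   6    10.88   5.9
EOF

names_above_25 sample.txt
```

Letting read split the line into named variables also avoids the unquoted array assignment v=($line).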

Upvotes: 0

Will Demaine

Reputation: 1396

Consider using awk. It's a simple tool for processing columns of text (and much more). Here's a simple awk tutorial which will give you an overview. If you want to use it within a bash script, then this tutorial should help.

Run this on the command line to give you an idea of how you could do it:

$ echo "SER A   1   161.15 138.3" | awk '{ if($5 > 25) print $1}'
SER
$ echo "SER A   1   161.15 138.3" | awk '{ if($5 > 140) print $1}'
$
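
To apply the same test to a whole file rather than a single echoed line, something like the following might work. The helper name res_above and the file name sample.txt are made up for illustration; NR > 1 skips the header row, and the limit is passed in with awk's -v option.

```shell
# res_above is a hypothetical helper: print column 1 of every data row
# whose fifth column is greater than the given limit.
res_above() {
  awk -v limit="$2" 'NR > 1 && NF >= 5 && $5 > limit { print $1 }' "$1"
}

# Sample data from the question, written to a hypothetical file name.
cat > sample.txt <<'EOF'
 res          ABS   sum
 SER A   1   161.15 138.3
 CYS A   2    66.65  49.6
 PRO A   3    21.48  15.8
 ALA A   4    77.68  72.0
 ILE A   5    15.70   9.0
 HIS A   6    10.88   5.9
EOF

res_above sample.txt 25    # prints SER, CYS and ALA, one per line
res_above sample.txt 140   # prints nothing
```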

Upvotes: 1

nullpotent

Reputation: 9260

This should do it:

awk 'BEGIN{FS=OFS=" "} NR > 1 {if($5 != 25) print $1}' bla.txt

Upvotes: 1

Tim Pote

Reputation: 28029

While you can do this with a while read loop in bash, it's easier, and most likely faster, to use awk:

awk '$5 != 25 { print $1 }'

Note that your condition, "print resnames if sum > 25 and sum < 25", taken to mean either one or the other, is the same as "print if sum != 25".
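
If the two groups need to be told apart instead of merged, a single awk pass can label each row. This is a sketch under the assumption that the header occupies the first line; classify_by_sum and sample.txt are hypothetical names.

```shell
# classify_by_sum is a hypothetical helper: label each data row by
# whether its fifth column is above or below 25. Rows where the value
# is exactly 25 are not printed.
classify_by_sum() {
  awk 'NR > 1 && NF >= 5 {
    if ($5 > 25)      print $1, "above"
    else if ($5 < 25) print $1, "below"
  }' "$1"
}

# Sample data from the question, written to a hypothetical file name.
cat > sample.txt <<'EOF'
 res          ABS   sum
 SER A   1   161.15 138.3
 CYS A   2    66.65  49.6
 PRO A   3    21.48  15.8
 ALA A   4    77.68  72.0
 ILE A   5    15.70   9.0
 HIS A   6    10.88   5.9
EOF

classify_by_sum sample.txt
```

With the sample data, SER, CYS and ALA are labelled above and PRO, ILE and HIS are labelled below.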

Upvotes: 1
