Yamuna_dhungana

Reputation: 663

Extract rows from table where values are less than and greater than in columns in shell

I have a very large tab-separated table (24 GB in size) with columns C1, C2, C3 and C4, as shown below. I would like to extract the rows that have C1 < 0.6 and C2 < 0.4. How do I do this in a Unix shell using logical operators?

C1     C2    C3     C4
0.8    0.1   A1     C.a 
0.2    0.3   A2     C.b
0.5    0.8   A3     C.c
0.1    0.1   A4     C.c

Result I expect:

C1     C2    C3     C4
0.2    0.3   A2     C.b
0.1    0.1   A4     C.c

Upvotes: 0

Views: 945

Answers (2)

RavinderSingh13

Reputation: 133458

1st solution: This simple awk command should do the job for you. FNR==1 keeps the header line; every other line is printed only when both conditions hold.

awk 'FNR==1 || ($1<.6 && $2<.4)' Input_file

Or, for a tab-separated Input_file, try the following:

awk 'BEGIN{FS=OFS="\t"}FNR==1 || ($1<.6 && $2<.4)' Input_file
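
To check the tab-separated command against the sample from the question, a small test run could look like the sketch below. The helper name filter_rows and the temporary sample file are only assumptions for illustration; they are not part of the original answer.

```shell
# Build a small tab-separated sample that mirrors the table in the question.
sample=$(mktemp)
printf 'C1\tC2\tC3\tC4\n' > "$sample"
printf '0.8\t0.1\tA1\tC.a\n' >> "$sample"
printf '0.2\t0.3\tA2\tC.b\n' >> "$sample"
printf '0.5\t0.8\tA3\tC.c\n' >> "$sample"
printf '0.1\t0.1\tA4\tC.c\n' >> "$sample"

# filter_rows is a hypothetical wrapper around the awk one-liner above.
# FNR==1 keeps the header; other rows need C1 < 0.6 and C2 < 0.4.
filter_rows() {
  awk 'BEGIN{FS=OFS="\t"} FNR==1 || ($1<.6 && $2<.4)' "$1"
}

filter_rows "$sample"
```

On this sample it prints the header followed by the A2 and A4 rows, which matches the result expected in the question.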


2nd solution (generic): If you don't want to hard-code the field numbers of C1 and C2 and would rather find them from the header line programmatically, try the following. Add BEGIN{FS=OFS="\t"} to it if your Input_file is tab-delimited.

awk -v c1Thre="0.6" -v c2Thre="0.4" '
FNR==1{
  for(i=1;i<=NF;i++){
    if($i=="C1"){ C1Field=i }
    if($i=="C2"){ C2Field=i }
  }
  print
  next
}
$C1Field<c1Thre && $C2Field<c2Thre
'  Input_file
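
One thing to watch with the generic version: if a column name is not found in the header, its field variable stays unset and awk compares the whole line ($0) instead. A possible variant that stops early in that case is sketched below. The function name filter_by_name, its argument order and the exit status 2 are assumptions made for this example, and it assumes tab-delimited input.

```shell
# filter_by_name FILE NAME1 THRESHOLD1 NAME2 THRESHOLD2
# Hypothetical helper: looks up both columns by header name and
# prints the header plus rows where both values are below their thresholds.
filter_by_name() {
  awk -v n1="$2" -v t1="$3" -v n2="$4" -v t2="$5" '
  BEGIN { FS = OFS = "\t" }
  FNR == 1 {
    # Record the position of each requested column.
    for (i = 1; i <= NF; i++) {
      if ($i == n1) f1 = i
      if ($i == n2) f2 = i
    }
    # Stop with a non-zero status if either name is missing.
    if (!f1 || !f2) {
      print "column not found: " n1 " or " n2 > "/dev/stderr"
      exit 2
    }
    print
    next
  }
  $f1 < t1 && $f2 < t2
  ' "$1"
}

# Usage on a sample that mirrors the table in the question.
sample=$(mktemp)
printf 'C1\tC2\tC3\tC4\n0.8\t0.1\tA1\tC.a\n0.2\t0.3\tA2\tC.b\n' > "$sample"
printf '0.5\t0.8\tA3\tC.c\n0.1\t0.1\tA4\tC.c\n' >> "$sample"
filter_by_name "$sample" C1 0.6 C2 0.4
```

Passing the thresholds with -v lets awk treat them as numbers when they look numeric, so the comparisons stay numeric as in the original answer.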

Upvotes: 1

dev

Reputation: 961

Try this. The sample has runs of 3 or 4 spaces between columns, so I squeeze each run into a single "," before filtering, which means the output is comma-separated. NR == 1 keeps the header line. For a truly tab-separated file, skip tr and use awk -F"\t" instead.

tr -s " " "," < mydata.txt | awk -F"," 'NR == 1 || ($1 < 0.6 && $2 < 0.4)'

Upvotes: 0
