Remove duplicate records from a three column file

Question

I am trying to remove some entries from a text file containing 3 columns. The first two columns contain IDs and the third contains a percentage, as follows:

ID#3    ID#1    100.00
ID#4    ID#4    40.00
ID#4    ID#5    33.065
ID#5    ID#5    100.000    
ID#5    ID#4    33.065
ID#6    ID#6    100.000

I want to "remove" every entry where both IDs are the same, BUT ONLY WHEN the percentage is 100%, so that the desired output looks like:

ID#3    ID#1    100.00    
ID#4    ID#4    40.00
ID#4    ID#5    33.065
ID#5    ID#4    33.065

I tried this:

cat file.txt | awk '$3!=100.0 && $1=$2 {print $1,$2}'

but I can't find a way to include the cases where the first two columns are not the same!

RavinderSingh13 · Accepted Answer

Could you please try the following.

awk '($1==$2) && $NF==100{next} 1' Input_file

Explanation: a detailed explanation of the above code.

awk '                       ##Start of the awk program.
($1==$2) && $NF==100{       ##If $1 (first field) equals $2 (second field) AND $NF (last field) is numerically equal to 100, do the following.
  next                      ##next skips the remaining statements for this line, so it is not printed.
}                           ##End of the block for the above condition.
1                           ##1 is an always-true condition, so the default action (print the current line) runs.
' Input_file                ##Name of the input file.
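
To check the command against the sample data from the question, something like the following sketch could be used. It assumes a POSIX shell and a standard awk; the variable names input, result and attempt are only illustrative and do not come from the question. The second part shows what probably went wrong in the original attempt: = is an assignment in awk, while == is the comparison.

```shell
# Sample input copied from the question (whitespace-separated columns).
input='ID#3    ID#1    100.00
ID#4    ID#4    40.00
ID#4    ID#5    33.065
ID#5    ID#5    100.000
ID#5    ID#4    33.065
ID#6    ID#6    100.000'

# The command from this answer: skip lines where both IDs match and the
# last field is numerically equal to 100, print everything else unchanged.
result=$(printf '%s\n' "$input" | awk '($1==$2) && $NF==100{next} 1')
printf '%s\n' "$result"

# The attempt from the question uses "=" (assignment) instead of "=="
# (comparison), so $1 is overwritten with $2 before the line is printed.
attempt=$(printf '%s\n' 'ID#4    ID#5    33.065' | awk '$3!=100.0 && $1=$2 {print $1,$2}')
printf '%s\n' "$attempt"   # prints: ID#5 ID#5
```

Because $NF is compared with the number 100, values such as 100.00 and 100.000 are both matched. The line ID#3 ID#1 100.00 is kept because its two IDs differ.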
