user1415722

Reputation: 1

compare columns in the same file

My file has thousands of lines and it looks like this:

R4604                17131G1                   499456.1 1966201.0   0.0  1000001
R4604                17131G1                   499456.1 1966201.0   8.5  1000001
R4604                17131G1                   499456.1 1966201.0   8.5  1000001
R4604                17131G1                   499456.1 1966201.0   8.5  1000001
R4604                17131G1                   499456.1 1966201.0   8.5  1000001
R4604                17131G1                   499456.1 1966201.0   8.5  1000001
R4604                17131G1                   499456.1 1966201.0   8.5  1000001
R4604                17131G1                   499456.1 1966201.0   8.5  1000001
R4604                17131G1                   499456.1 1966201.0   8.5  1000001
R4496                12011G1                   473856.2 1960800.9   0.0  1000001
R4496                12011G1                   473856.2 1960800.9  64.0  1000001

What I want to get as output is:

R4604                17131G1                   499456.1 1966201.0   8.5  1000001
R4496                12011G1                   473856.2 1960800.9  64.0  1000001

So if columns 1-5 are identical, keep just one line. If columns 1-4 are identical but column 5 differs, remove the line whose column 5 value is 0.0.

Thanks for your help, Alejandro

Upvotes: 0

Views: 940

Answers (1)

Dennis Williamson

Reputation: 360733

Assuming sorted input, that column 5 is always either 0 or the same non-zero value for each line in a group, and that there are no lines to be kept with column 5 equal to 0:

awk '$5 != 0 {key = $1 FS $2 FS $3 FS $4 FS $5; if (saved != "" && prev != key) {print saved}; prev = key; saved = $0} END {if (saved != "") print saved}' inputfile
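
If those assumptions do not hold, something along these lines might work instead. This is only a sketch: the function name dedupe_rows is made up for illustration, and it assumes whitespace-separated fields with a numeric column 5. It keeps the first line seen for each distinct combination of columns 1-5, then drops the lines where column 5 is zero only when the same columns 1-4 also appear with a non-zero column 5. Unlike the one-liner above, it does not need sorted input, and it keeps a zero line when that is the only line in its group. It holds all unique lines in memory until the end of the input.

```shell
# Hypothetical helper: reads the named files (or stdin if none are given).
dedupe_rows() {
  awk '
    {
      group = $1 SUBSEP $2 SUBSEP $3 SUBSEP $4   # columns 1-4 identify a group
      key = group SUBSEP $5                      # columns 1-5 identify a duplicate
      if (key in seen) next                      # skip exact repeats of columns 1-5
      seen[key] = 1
      n++
      line[n] = $0
      grp[n] = group
      zero[n] = ($5 == 0)                        # numeric test, so 0.0 counts as zero
      if (!zero[n]) nonzero[group] = 1           # group has at least one non-zero value
    }
    END {
      # Print in order of first appearance; drop a zero line only when
      # its group also contains a non-zero column 5.
      for (i = 1; i <= n; i++)
        if (!zero[i] || !(grp[i] in nonzero))
          print line[i]
    }
  ' "$@"
}
```

Usage would be dedupe_rows inputfile. Note that column 5 is compared as text when detecting duplicates, so 8.5 and 8.50 would count as different values, and a group with several distinct non-zero values keeps one line for each of them.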

Upvotes: 3
