Reputation: 13
I have a file called probabilities.txt and it's a two column file with the first column listing distances and the second column probabilities.
The sample data is as follows:
0.2 0.05
0.4 0.10
0.6 0.63
0.8 0.11
1.0 0.03
... ...
10.0 0.01
I would like to print out the line that has the maximum value in column 2. I've tried the following:
awk 'BEGIN{a= 0} {if ($2 > a) a = $2} END{print $1, a}' probabilities.txt
This was the desired output:
0.6 0.63
But this is the output I get:
10.0 0.63
It seems like the code I wrote is just getting the max value in each column and then printing it out rather than printing out the line that has the max value in column 2. Printing out $0 also just prints out the last line of the file.
I assume one could fix this by treating the lines as an array rather than a scalar, but I'm not really sure how to do that since I'm a beginner. I would appreciate any help.
Upvotes: 0
Views: 275
Reputation: 84521
I had contemplated just leaving the answer as a comment, but given the trouble you had with the command it's worth writing up. To begin, you don't need BEGIN. In awk, a variable that has not been set yet evaluates to 0 in a numeric context (and to the empty string in a string context), so you can compare against a max variable before ever assigning to it.
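A quick way to convince yourself of that default is the small check below. It is only an illustrative sketch: the variable names x and unset_check are made up for the demonstration and are not part of your command.

```shell
# x is never assigned, so it compares equal to both 0 and the empty string.
unset_check=$(awk 'BEGIN { if (x == 0 && x == "") print "uninitialized"; else print "set" }')
echo "$unset_check"
```

This prints uninitialized, which is why the first comparison against max behaves as if max were 0.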
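Note: if your data could include negative numbers (neither distances nor probabilities can), add a new first rule that sets max to the value in the first record and saves that record as well (e.g. FNR==1 {max=$2; maxline=$0; next}, where maxline is the variable that holds the saved record in the commands below).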
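As a rough sketch of that negative-number variant, the snippet below seeds max from the first record. The sample values and the tmpfile and result names are hypothetical and only there to make the example self-contained. Adding 0 to $2 is a defensive choice to force a numeric comparison.

```shell
# Hypothetical sample data: every value in column 2 is negative.
tmpfile=$(mktemp)
printf '%s\n' '0.2 -0.50' '0.4 -0.10' '0.6 -0.30' > "$tmpfile"

# Rule 1 seeds max and maxline from the first record, then skips to the next line.
# Rule 2 replaces them whenever a larger column-2 value appears.
result=$(awk 'FNR==1 {max=$2+0; maxline=$0; next}
              $2 > max {max=$2+0; maxline=$0}
              END {print maxline}' "$tmpfile")

echo "$result"
rm -f "$tmpfile"
```

With this sample the command prints 0.4 -0.10. Without the first rule, max would start at 0, no value would ever exceed it, and an empty line would be printed.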
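Next, don't save individual field values when you want to capture the entire line (record) with the largest probability. In most awk implementations, $1 and $0 in the END rule still refer to the last record read, which is why your command printed 10.0 next to the maximum. Instead, save the entire record associated with the max value. Then, in your END rule, all you need to do is print that record.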
Putting it all together, you would have:
awk '{if($2 > max) {max=$2; maxline=$0}} END {print maxline}' file
or, if you prefer:
awk '$2 > max {max=$2; maxline=$0} END {print maxline}' file
Example Use/Output
With your data in the file distprobs.txt, you would get:
$ awk '{if($2 > max) {max=$2; maxline=$0}} END {print maxline}' distprobs.txt
0.6 0.63
and the second version gives the same result:
$ awk '$2 > max {max=$2; maxline=$0} END {print maxline}' distprobs.txt
0.6 0.63
Upvotes: 2