RIXS

Reputation: 115

How to find maximum value in a column with awk

I have a file with two sets of data divided by a blank line:

a     3  
b     2  
c     1 

e     5   
d     8  
f     1  

Is there a way to find the maximum value of the second column in each set and print the corresponding line with awk? The result should be:

a 3
d 8  

Thank you.

Upvotes: 2

Views: 568

Answers (5)

RavinderSingh13

Reputation: 133428

Could you please try the following, written and tested with your shown samples in GNU awk.

awk '
!NF{
  if(max!=""){ print arr[max],max }
  max=""
}
{
  max=( (max<$2) || (max=="") ? $2 : max )
  arr[$2]=$1
}
END{
  if(max!=""){ print arr[max],max }
}
' Input_file

Explanation: a detailed walkthrough of the code above.

awk '                            ##Start of the awk program.
!NF{                             ##If the line has no fields (a blank line), a block has ended.
  if(max!=""){ print arr[max],max }  ##If max is set, print the name stored for it, then max.
  max=""                         ##Reset max for the next block.
  }
{
  max=( (max<$2) || (max=="") ? $2 : max )      ##If max is unset or smaller than the 2nd field, take the 2nd field as max; otherwise keep max.
  arr[$2]=$1                     ##Store the 1st field in arr, indexed by the 2nd field.
}
END{                             ##END block, run after the last line is read.
  if(max!=""){ print arr[max],max }  ##Print the result for the final block, if max is set.
}
' Input_file                     ##Input file name.

Upvotes: 3

karakfa

Reputation: 67467

Another awk:

$ awk '{cmd="sort -k2nr | head -1"} !NF{close(cmd)} {print | cmd}' file

a     3
d     8

Closing the command at each blank line makes sort and head run once per block, so each run prints that block's maximum.
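To see what each per-block run does, here is a minimal sketch that feeds one block of the sample data through the same pipeline by hand. The function name block_top is hypothetical. Note that in the one-liner above the blank separator line itself is also piped into the next run; the sketch assumes, as with GNU sort, that an empty numeric key sorts as 0, so this only stays harmless while the values are positive.

```shell
# Hypothetical helper: the pipeline the awk program opens once per block.
# sort -k2nr orders by the 2nd field, numerically, largest first;
# head -1 keeps only the top line, i.e. the block maximum.
block_top() { sort -k2nr | head -1; }

# First block of the sample input.
printf 'a 3\nb 2\nc 1\n' | block_top

# Second block, including the leading blank line that awk passes along.
printf '\ne 5\nd 8\nf 1\n' | block_top
```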

Upvotes: 1

James Brown

Reputation: 37394

Another awk:

$ awk '
!$0 {
    print n
    m=n=""
}
$2>m {
    m=$2
    n=$0
}
END {
    print n
}' file

Output:

a     3  
d     8  
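If the values might be zero or negative, or the separator lines might contain stray whitespace, a variant with an explicit flag avoids relying on how an unset variable compares. This is only a sketch under those assumptions; the function name block_max and the variables seen, max and best are made up for illustration.

```shell
# Hypothetical helper: print the line with the largest 2nd field
# in each blank-line-separated block.
block_max() {
  awk '
    !NF {                     # blank or whitespace-only line: the block ends
      if (seen) print best    # print only if the block had data
      seen = 0
      next
    }
    !seen || $2 + 0 > max {   # first line of a block, or a new numeric maximum
      max = $2 + 0
      best = $0
      seen = 1
    }
    END { if (seen) print best }   # the last block has no trailing blank line
  ' "$@"
}

printf 'a 3\nb 2\nc 1\n\ne 5\nd 8\nf 1\n' | block_max
```

On ties the first line with the maximum value is kept, because the comparison is strict.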

Upvotes: 1

anubhava

Reputation: 784898

You may use this alternative GNU awk:

awk -v RS= '{
   max=""
   split($0, a, /[^[:space:]]+/, m)
   for (i=1; i in m; i+=2)
      if (!max || m[i+1] > max) {
         mi = i
         max = m[i+1]
      }
   print m[mi], m[mi+1]
}' file

a 3
d 8

Upvotes: 1

Andrea

Reputation: 125

You could try to separate the data sets by doing:
awk -v RS= 'NR == 1 {print}' yourfile > anotherfile
This returns the first data set; change the condition to NR == 2 to get the second data set,
and then find the maximum in each data set as suggested here.
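The same paragraph mode (RS set to the empty string) can also do both steps in one pass, without intermediate files. The sketch below assumes every line holds exactly two whitespace-separated fields, so the fields of a record alternate between label and value. The function name block_max_paragraph is hypothetical.

```shell
# Hypothetical helper: one record per blank-line-separated block.
block_max_paragraph() {
  awk -v RS= '
    {
      # With the default FS, newlines split fields too, so the record
      # reads as: label, value, label, value, ...
      best = 1                                  # index of the best label so far
      for (i = 3; i < NF; i += 2)
        if ($(i + 1) + 0 > $(best + 1) + 0)     # compare the values numerically
          best = i
      print $best, $(best + 1)
    }
  ' "$@"
}

printf 'a 3\nb 2\nc 1\n\ne 5\nd 8\nf 1\n' | block_max_paragraph
```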

Upvotes: -1
