gusgrave

Reputation: 45

Fail to cycle multiple input files with awk/gawk

I have a ton of files in subfolders, each containing three columns of numbers. For each file, I need to find the largest number in $2 and then print $1 and $2 from that row.

This is what I got:

awk 'FNR > 1 {max=dist=0; if($2>max){dist=$1; max=$2}}END{print FILENAME "   distance: " dist "   max: " max}' ./nVT_*K/rdf_rdf_aam_aam_COM.dat

This works, but it only prints values for the last input file. I need one result from each file.

Iterating using a bash for loop produced a "command not found" for the awk part. I am currently piping the echoed for loop output to a file and running as a script, though this is not a feasible plan in the long run.

Can anyone help rework this so that it takes a bunch of input files in different subfolders and prints the intended result for each file, like this:

./nVT_277K/rdf_rdf_aam_aam_COM.dat   distance: 4.650000   max: 1.949975
./nVT_283K/rdf_rdf_aam_aam_COM.dat   distance: 4.650000   max: 1.943047
./nVT_289K/rdf_rdf_aam_aam_COM.dat   distance: 4.650000   max: 1.907280
...
...
...

I'd be extremely grateful for any input here. Thanx

Upvotes: 3

Views: 74

Answers (2)

karakfa

Reputation: 67507

Assuming each file has at least one positive value in $2 (so that max does not need to be initialized):

$ awk 'FNR==1    {f=FILENAME; next}   # skip line 1 of each file, as the question does
       $2>max[f] {max[f]=$2; dist[f]=$1}
       END       {for(f in max) print f, "distance:", dist[f], "max:", max[f]}' files

max and dist are indexed by filename, which is unique because each file has a distinct path. Note that the order in which for(f in max) visits the files is unspecified.
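A variation on the per-file array idea, as a minimal sketch rather than a definitive version: it tests whether the filename key exists yet instead of relying on a positive value, skips the first line of each file as the question's command does, and pipes the result through sort because the traversal order of an awk array is unspecified. The sample files, the rdf.dat name and the per_file_max variable are assumptions made up for illustration.

```shell
# Hypothetical sample data: two small three-column files, each with a header line.
tmp=$(mktemp -d)
mkdir -p "$tmp/nVT_277K" "$tmp/nVT_283K"
printf 'r g n\n1.0 0.5 0\n2.0 1.9 0\n3.0 1.2 0\n' > "$tmp/nVT_277K/rdf.dat"
printf 'r g n\n1.0 0.7 0\n2.0 1.1 0\n3.0 1.4 0\n' > "$tmp/nVT_283K/rdf.dat"

# Track the largest column-2 value per file, keyed by FILENAME.
# FNR > 1 skips line 1 of each file; "in" checks whether the key is set yet,
# so negative-only data is handled too.
# Array traversal order is unspecified, hence the trailing sort.
per_file_max=$(
  awk 'FNR > 1 && (!(FILENAME in max) || $2 > max[FILENAME]) {
           max[FILENAME] = $2; dist[FILENAME] = $1
       }
       END { for (f in max) print f, "distance:", dist[f], "max:", max[f] }' \
      "$tmp"/nVT_*K/rdf.dat | sort
)
printf '%s\n' "$per_file_max"
rm -rf "$tmp"
```

A file that contributes no data rows never gets a key in max, so it is silently missing from the output.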

Upvotes: 0

Ed Morton

Reputation: 203995

With GNU awk for ENDFILE:

awk '
    FNR > 1 { if ((max=="") || ($2>max)) {dist=$1; max=$2} }
    ENDFILE { print FILENAME "   distance: " dist "   max: " max; max=dist="" }
' ./nVT_*K/rdf_rdf_aam_aam_COM.dat

With any awk and assuming your input files are not empty:

awk '
    FNR==1 { if (NR>1) print fname "   distance: " dist "   max: " max; max=dist=""; fname=FILENAME; next }
    (max=="") || ($2>max) { dist=$1; max=$2 }
    END { print fname "   distance: " dist "   max: " max }
' ./nVT_*K/rdf_rdf_aam_aam_COM.dat
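For comparison, the shell loop mentioned in the question can also be made to work by running one awk process per file. This is a different technique from the single-awk commands above, shown only as a rough sketch: the max_row function, the sample files and the loop_out variable are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: report the row with the largest column-2 value in one file.
# FNR > 1 skips the first line; max == "" is true until the first data row is seen.
max_row() {
  awk 'FNR > 1 && (max == "" || $2 > max) { dist = $1; max = $2 }
       END { print FILENAME "   distance: " dist "   max: " max }' "$1"
}

# Hypothetical sample data standing in for the real nVT_*K directories.
tmp=$(mktemp -d)
mkdir -p "$tmp/nVT_277K" "$tmp/nVT_283K"
printf 'r g n\n1.0 0.5 0\n2.0 1.9 0\n3.0 1.2 0\n' > "$tmp/nVT_277K/rdf.dat"
printf 'r g n\n1.0 0.7 0\n2.0 1.1 0\n3.0 1.4 0\n' > "$tmp/nVT_283K/rdf.dat"

# One awk process per file; quoting "$f" keeps paths with spaces intact.
loop_out=$(for f in "$tmp"/nVT_*K/rdf.dat; do max_row "$f"; done)
printf '%s\n' "$loop_out"
rm -rf "$tmp"
```

Because each file gets a fresh awk process, no state has to be reset between files, at the cost of starting awk once per file.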

Upvotes: 1
