Reputation: 45
I have a ton of files in subfolders, each containing three columns of numbers. For each file, I need to find the largest value in $2 and then print $1 and $2 from that row.
This is what I got:
awk 'FNR > 1 {max=dist=0; if($2>max){dist=$1; max=$2}}END{print FILENAME " distance: " dist " max: " max}' ./nVT_*K/rdf_rdf_aam_aam_COM.dat
This works, but it only prints values for the last input file. I need one line of output per file.
Iterating using a bash for loop produced a "command not found" for the awk part. I am currently piping the echoed for loop output to a file and running as a script, though this is not a feasible plan in the long run.
Can anyone help rework this so that it takes a bunch of input files in different subfolders and prints the intended result for each file, like this:
./nVT_277K/rdf_rdf_aam_aam_COM.dat distance: 4.650000 max: 1.949975
./nVT_283K/rdf_rdf_aam_aam_COM.dat distance: 4.650000 max: 1.943047
./nVT_289K/rdf_rdf_aam_aam_COM.dat distance: 4.650000 max: 1.907280
...
...
...
I'd be extremely grateful for any input here. Thanx
Upvotes: 3
Views: 74
Reputation: 67507
Assuming there is at least one positive value in each file (so that we don't need to initialize max):
$ awk 'FNR==1 {f=FILENAME; next}
$2>max[f] {max[f]=$2; dist[f]=$1}
END {for(f in max) print f, "distance:", dist[f], "max:", max[f]}' files
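max and dist are indexed by filename, since each filename has to be unique within the given path. Note that the order in which for(f in max) visits the files is unspecified, so the output lines may not come out in command-line order.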
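A quick way to check the array-based approach is to run it on a couple of throwaway files. This is only a sketch: the file names and numbers below are made up, the first line of each file is assumed to be a header (as the FNR > 1 in the question suggests), and the output is piped through sort because the for-in order is unspecified.

```shell
# Hypothetical test data: one header line, then "distance value extra" rows.
tmp=$(mktemp -d)
printf 'r g x\n1.0 0.5 0\n2.0 1.5 0\n3.0 0.7 0\n' > "$tmp/a.dat"
printf 'r g x\n1.0 2.5 0\n2.0 1.5 0\n' > "$tmp/b.dat"

# Per-file maximum of $2, keyed by file name; the header line is skipped via "next".
out=$(awk 'FNR==1 {f=FILENAME; next}
$2>max[f] {max[f]=$2; dist[f]=$1}
END {for(f in max) print f, "distance:", dist[f], "max:", max[f]}' "$tmp"/a.dat "$tmp"/b.dat | sort)

printf '%s\n' "$out"
rm -rf "$tmp"
```

Each input file should produce exactly one line, with the distance taken from the row that holds the largest value in column 2.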
Upvotes: 0
Reputation: 203995
With GNU awk for ENDFILE:
awk '
FNR > 1 { if ((max=="") || ($2>max)) {dist=$1; max=$2} }
ENDFILE { print FILENAME " distance: " dist " max: " max; max=dist="" }
' ./nVT_*K/rdf_rdf_aam_aam_COM.dat
With any awk and assuming your input files are not empty:
awk '
FNR==1 { if (NR>1) print fname " distance: " dist " max: " max; max=dist=""; fname=FILENAME; next }
(max=="") || ($2>max) {dist=$1; max=$2}
END { print fname " distance: " dist " max: " max }
' ./nVT_*K/rdf_rdf_aam_aam_COM.dat
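If you would rather keep a shell loop, as attempted in the question, a minimal sketch is to start one awk process per file. The loop from the question is not shown, so this does not diagnose the "command not found" error; print_max is a hypothetical helper name, not something from the original post. Because max starts out empty for every awk run, this also copes with files whose values are all negative.

```shell
# print_max FILE...  (hypothetical helper): one awk process per file.
print_max() {
    for f in "$@"; do
        [ -f "$f" ] || continue   # skip an unmatched glob or a missing file
        # Skip the header line, then keep the row with the largest $2.
        awk 'FNR > 1 && (max == "" || $2 > max) { dist = $1; max = $2 }
             END { print FILENAME " distance: " dist " max: " max }' "$f"
    done
}

print_max ./nVT_*K/rdf_rdf_aam_aam_COM.dat
```

This is slower than a single awk call when there are very many files, but the output order follows the glob order and no per-file reset logic is needed inside awk.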
Upvotes: 1