Reputation: 73
I have a sample file like this:
stat1: sample: 0.0
stat2: sample: 0.0
stat3: sample: 349.7
stat1: sample: 0.0
stat2: sample: 0.0
stat3: sample: 349.7
stat1: sample: 0.0
stat2: sample: 0.0
stat3: sample: 349.7
I need to find the maximum and mean value for each stat.
Upvotes: 0
Views: 1766
Reputation: 47099
This is pretty straightforward using awk associative arrays:
meanmax.awk
# Skip empty lines
NF==0 { next }
# Keep a tally of number of elements and their sum
{ cnt[$1]++; sum[$1] += $3 }
# If max[$1] has not been defined or if it is smaller than $3
cnt[$1] == 1 || max[$1] < $3 {
max[$1] = $3
}
END {
for (k in sum)
print k, max[k], sum[k]/cnt[k]
}
Run it like this:
awk -f meanmax.awk infile
Output:
stat1: 0.0 0
stat2: 0.0 0
stat3: 349.7 349.7
Or if run on the originally posted input:
stat1: 18.0 10.3333
stat2: 45.0 16.6667
stat3: 239.7 89.4667
Upvotes: 1
Reputation: 97938
If you don't mind using Perl:
perl -lane '{ if(/(stat\d+)/) {
$m{$1} = $F[2] if !defined($m{$1}) || $m{$1} < $F[2];
$s{$1}+=$F[2]; $c{$1}++
}} END{print "$_: $m{$_},".$s{$_}/$c{$_} for keys %c}' input
Upvotes: 1
Reputation: 753615
awk '{
if (NF != 3) next
sum[$1] += $3;
if (cnt[$1]++ == 0) { max[$1] = $3; min[$1] = $3; }
if ($3 > max[$1]) max[$1] = $3
if ($3 < min[$1]) min[$1] = $3
}
END {
printf "%-8s %4s %8s %8s %8s\n", "Sample", "N", "Minimum", "Maximum", "Average"
for (key in sum)
{
printf "%-8s %4d %8.2f %8.2f %8.2f\n", key, cnt[key], min[key], max[key], sum[key]/cnt[key]
}
}' data-file
Sample output (from the data in the question, which is singularly unexciting):
Sample N Minimum Maximum Average
stat2: 3 0.00 0.00 0.00
stat1: 3 0.00 0.00 0.00
stat3: 3 349.70 349.70 349.70
This code includes the minimum as well as the maximum; it's easy to remove if it is unwanted. Note that it skips any line that does not have exactly three fields, which includes blank lines in the data file.
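The rows in the sample output come out as stat2, stat1, stat3 because awk does not guarantee any particular order for `for (key in sum)`. If a stable order matters, one option is to pipe the result through `sort`. The following is only a minimal sketch of that idea: the function name `stat_summary` is a hypothetical wrapper introduced here for illustration, and it drops the header row and the minimum so that plain `sort` is enough.

```shell
# stat_summary is a hypothetical helper name, not part of the answer above.
# It prints "key max mean" per stat, sorted by key.
stat_summary() {
    awk 'NF == 3 {
        sum[$1] += $3
        # First value seen for this key, or a new maximum
        if (cnt[$1]++ == 0 || $3 > max[$1]) max[$1] = $3
    }
    END {
        for (key in sum)
            printf "%s %.2f %.2f\n", key, max[key], sum[key] / cnt[key]
    }' "$@" | sort
}

# Usage with the sample data from the question
stat_summary <<'EOT'
stat1: sample: 0.0
stat2: sample: 0.0
stat3: sample: 349.7
stat1: sample: 0.0
stat2: sample: 0.0
stat3: sample: 349.7
stat1: sample: 0.0
stat2: sample: 0.0
stat3: sample: 349.7
EOT
```

Sorting outside awk keeps the script portable; gawk also has its own ways to control traversal order, but those are not available in every awk.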
Upvotes: 1
Reputation: 7610
If you need the mean and max of all values, you can try something like this:
awk '/sample:/ {s[$1] += $3; if(++n[$1]==1 || max[$1]<$3) max[$1] = $3}
END { for (i in s) print i" mean = "s[i]/n[i]", max = "max[i] }
' <<EOT
stat1: sample: 0.0
stat2: sample: 0.0
stat3: sample: 349.7
stat1: sample: 0.0
stat2: sample: 0.0
stat3: sample: 349.7
stat1: sample: 0.0
stat2: sample: 0.0
stat3: sample: 349.7
EOT
Output:
stat3: mean = 349.7, max = 349.7
stat1: mean = 0, max = 0.0
stat2: mean = 0, max = 0.0
Upvotes: 1