obtain averages of field 2 after grouping by field 1 with awk

Question

I have a file with two numeric fields, sorted numerically on field 1. The numbers in field 1 range from 1 to 200000 and the numbers in field 2 range from 0 to 1. I want to obtain averages of both field 1 and field 2 in batches of rows.

Here is an example of the input and output when specifying batches of 4 rows:

The output would be:

1.5 0.327
563.25  0.247
1899.75 0.45

janos · Accepted Answer

Here you go:

awk -v n=4 '{s1 += $1; s2 += $2; if (++i % n == 0) { print s1/n, s2/n; s1=s2=0; } }'

Explanation:

Set n=4 with -v; this is the batch size
Accumulate the sums on every row: the 1st column in s1, the 2nd in s2
Increment the row counter i by 1 (an unset awk variable is 0 in numeric context, so there is no need to initialize it)
If i is divisible by n with no remainder, a batch is complete: print the two averages and reset the sum variables
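
If the number of rows is not a multiple of n, the one-liner above prints nothing for the final short batch. Here is a minimal sketch of a variant that also averages the leftover rows, assuming that is what you want. The function name batch_avg and the sample data are made up for illustration and are not part of the original question or answer.

```shell
# batch_avg N [FILE...]: print per-batch averages of fields 1 and 2,
# including a final short batch. (batch_avg is a hypothetical helper name.)
batch_avg() {
    n=$1; shift
    awk -v n="$n" '
        { s1 += $1; s2 += $2; i++ }                   # accumulate sums, count rows in batch
        i == n { print s1/i, s2/i; s1 = s2 = i = 0 }  # full batch: print averages and reset
        END { if (i > 0) print s1/i, s2/i }           # leftover rows, if any
    ' "$@"
}

# Made-up sample data: 6 rows, so the last batch holds only 2 rows.
printf '%s\n' '1 0.5' '2 0.5' '3 0.25' '4 0.75' '10 0.2' '20 0.4' | batch_avg 4
```

Here i counts rows within the current batch instead of across the whole file, so the END block can divide by the actual number of leftover rows rather than by n.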
