Reputation: 107

Average of every nth line in bash

I was not sure how to formulate the question but here it is.

I have a long file with 12/24/36/48... lines.

The file looks like this.

What I want to do is average all lines beginning with 0, for example (413+412)/2 for the 0 line, then all lines beginning with 1, and so on up to 11. The output would have only 12 lines, each holding the average of every nth line.

I'm really struggling. I know how to use awk to pick out every line beginning with a given number, but it gets a little bit confusing from there.

Upvotes: 2

Answers (3)

David C. Rankin

Reputation: 84531

Bash provides an easy solution (updated to keep an individual count for each index 0 .. 11). A further update sets the integer attribute on the arrays, which allows a more succinct increment of values inside arithmetic expressions:

#!/bin/bash

[ -f "$1" ] && [ -r "$1" ] || {     # test filename provided, is a regular file & is readable
    printf "\n Error: invalid input. Usage:  %s <input_file>\n\n" "${0##*/}"
    exit 1
}

declare -ai cnt      # count of how many times 0..11 encountered
declare -ai sum      # array holding running total of each 0 .. 11

while read -r idx val || [ -n "$val" ]; do      # read each line
    ((sum[idx]+=val))                           # keep sum of each 0 .. 11
    ((cnt[idx]++))                              # keep cnt of each 0 .. 11
done <"$1"

## for each index seen, compute the average and print it (using bc for division)
printf "\nThe sums and averages of each line index are:\n\n"
for i in "${!sum[@]}"; do                       # iterate over the indexes actually present
    avg=$(printf "scale=4; %s/%s\n" "${sum[i]}" "${cnt[i]}" | bc)
    printf "  %4s  %8s / %-3s = %.3f\n" "$i" "${sum[i]}" "${cnt[i]}" "$avg"
done

exit 0

output:

$ bash avgnthln.sh dat/avgln.dat

The sums and averages of each line index are:

     0       825 / 2   = 412.500
     1       758 / 2   = 379.000
     2       526 / 2   = 263.000
     3       544 / 2   = 272.000
     4        84 / 2   = 42.000
     5       103 / 2   = 51.500
     6       814 / 2   = 407.000
     7      1068 / 2   = 534.000
     8      1887 / 2   = 943.500
     9      1969 / 2   = 984.500
    10       752 / 2   = 376.000
    11      1982 / 2   = 991.000

Upvotes: 1

Etan Reisner

Reputation: 80921

awk '{sum[$1]=sum[$1] + $2; nr[$1]++} END {for (a in sum) {print a, sum[a]/nr[a]}}' file

Keep a running sum of the second field indexed by the first field. Also count how many of each first field you see. Then loop over all the seen fields and print out the field and the average.

The for (a in sum) loop visits the indexes in an unspecified order. If you want the output in order, you can pipe it to sort -n, or use a numeric loop in the END block (if you know the minimum/maximum values ahead of time). You could also track the max value in the main action block and loop up to that, but this was simpler.
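As a rough sketch of the "pipe to sort" option, the one-liner can be wrapped in a small function and sorted numerically on the first field. The function name avg_by_index and the sample data are made up for illustration (only 413 and 412 for index 0 come from the question), and the input is assumed to be whitespace-separated "index value" pairs.

```shell
#!/bin/bash

# Hypothetical helper: average the second field per first-field index,
# then sort the result numerically so that 10 comes after 9, not after 1.
avg_by_index() {
    awk '{ sum[$1] += $2; nr[$1]++ }
         END { for (a in sum) print a, sum[a] / nr[a] }' "$1" | sort -n
}

# Made-up sample input (index, value), two lines per index.
sample=$(mktemp)
printf '%s\n' '0 413' '1 380' '10 376' '0 412' '1 378' '10 376' > "$sample"

avg_by_index "$sample"
# prints:
# 0 412.5
# 1 379
# 10 376

rm -f "$sample"
```

sort -n is used rather than plain sort because a lexical sort would place index 10 and 11 right after 1.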

Upvotes: 3

Gilles Quénot

Reputation: 184995

awk '$1 == 0{c++;r+=$2}END{print r/c}' file
412.5

Feel free to improve it for other lines...
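One possible way to extend this to the other lines is to pass the index in with awk's -v option instead of hard-coding 0. This is only a sketch: the function name avg_for_index and the sample data are hypothetical (413 and 412 are the values quoted in the question), and the guard against an empty match is an addition that the one-liner above does not have.

```shell
#!/bin/bash

# Hypothetical helper: $1 = index to average, $2 = input file.
avg_for_index() {
    awk -v idx="$1" '$1 == idx { c++; r += $2 }
                     END { if (c) print r / c }' "$2"   # print nothing if the index never appears
}

# Made-up sample input (index, value).
sample=$(mktemp)
printf '%s\n' '0 413' '1 380' '0 412' '1 378' > "$sample"

avg_for_index 0 "$sample"    # prints 412.5
avg_for_index 1 "$sample"    # prints 379

rm -f "$sample"
```

Calling it once per index rereads the file each time, so for all 12 indexes at once the array-based awk answer is the cheaper route.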

Upvotes: 3
