Average of multiple files with unequal row sizes in Shell

Question

I have 15 data files with different numbers of rows, but the number of columns is the same in every file, e.g.

ifile1.dat   ifile2.dat  ifile3.dat and so on ............
  0   0        0   0       1   6        
  1   2        5   3       2   7
  2   5        6   10      4   6
  5   2        8   9       5   9
  10  2        10  3       8   2

In each file, the 1st column is the index number. I would like to compute the average over all these files for each index number in column 1, i.e.

ofile.txt
 0   0     [This is computed as (0+0)/2]
 1   4     [This is computed as (2+6)/2]
 2   6     [This is computed as (5+7)/2]
 3         [no value]
 4   6     [This is computed as (6)/1]
 5   4.66  [This is computed as (2+3+9)/3]
 6   10
 7   
 8   5.5
 9   
 10  2.5

I can't think of a simple way to do this. The one method I came up with seems very lengthy: pad all the files to the same number of rows and then take the average, e.g.

ifile1.dat   ifile2.dat  ifile3.dat and so on ............ 
  0   0        0   0       0
  1   2        1           1   6  
  2   5        2           2   7  
  3            3           3 
  4            4           4   6
  5   2        5   3       5   9
  6            6   10      6
  7            7           7
  8            8   9       8   2
  9            9           9
  10  2        10  3       10

John1024 · Accepted Answer

$ awk '{s[$1]+=$2; c[$1]++;} END{for (i in s) print i,s[i]/c[i];}' ifile*.dat
0 0
1 4
2 6
4 6
5 4.66667
6 10
8 5.5
10 2.5

The code above uses two arrays, s and c: s[i] is the sum of the column-2 values over all rows with index i, and c[i] is the number of such rows. After all the files have been read, the END block prints the average, s[i]/c[i], for each index i.
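The one-liner prints only the indices that occur in at least one file, and awk does not guarantee the order in which `for (i in s)` visits them. If the gaps (3, 7, 9) should also appear in the output, as in the ofile.txt from the question, a variant along the following lines might work. It is only a sketch: the function name average_by_index is made up for illustration, and it assumes the indices are non-negative integers and that every row is either "index value" or a bare index.

```shell
# Hypothetical helper (name chosen for illustration, not from the original answer).
# Averages column 2 per index (column 1) over all files given as arguments and
# prints every integer index from 0 up to the largest index seen.
# Assumption: indices are non-negative integers.
average_by_index() {
    awk '
        BEGIN   { max = -1 }                           # no index seen yet
        NF >= 1 { k = $1 + 0; if (k > max) max = k }   # normalise index, track largest
        NF >= 2 { sum[k] += $2; count[k]++ }           # rows without a value are not counted
        END {
            for (i = 0; i <= max; i++) {               # numeric loop gives sorted output
                if (i in count)
                    print i, sum[i] / count[i]
                else
                    print i                            # index present in no file
            }
        }
    ' "$@"
}

# Usage (file names as in the question):
#   average_by_index ifile*.dat > ofile.txt
```

Looping over the integers from 0 to the largest index, rather than over the array keys, is what fills in the missing indices and fixes the output order at the same time.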
