Reputation: 1
I have an apparently "simple" problem, but for some reason I can't find the solution...
I have n million files of different sizes and I want to find the average file size.
To simplify it, I grouped them into buckets at multiples of 16 KB:
< 16 KB = 18689546 files
< 32 KB = 1365713 files
< 48 KB = 1168186 files
...
Of course, the simple (total_size / number_of_files) does not seem to work: it gives an average of 291 KB...
What algorithm would calculate the real average?
Thx, JD
Upvotes: 0
Views: 769
Reputation: 340218
You might be running into integer overflow when summing the file sizes (the total size probably doesn't fit into a 32-bit value). The easiest fix is probably to use a 64-bit integer for the variable that holds the sum.
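To illustrate the point, here is a minimal sketch in Python. Python integers never overflow, so the second function masks the running sum to 32 bits to imitate what a 32-bit accumulator would do in a language like C. The function names and the sample sizes are made up for the example; they are not taken from your data.

```python
def mean_file_size(sizes):
    """Mean size in bytes, using an exact (non-overflowing) running sum."""
    total = 0
    count = 0
    for size in sizes:
        total += size
        count += 1
    return total / count if count else 0.0


def mean_with_32bit_sum(sizes):
    """Same mean, but the running sum wraps like an unsigned 32-bit integer."""
    total = 0
    count = 0
    for size in sizes:
        total = (total + size) & 0xFFFFFFFF  # keep only the low 32 bits
        count += 1
    return total / count if count else 0.0


# Hypothetical data: three files of 2 GiB each (6 GiB total, more than 2**32 bytes).
sizes = [2 * 1024**3] * 3

print(mean_file_size(sizes))       # exact mean
print(mean_with_32bit_sum(sizes))  # distorted by wraparound
```

If the two results differ on your real data, the accumulator is too narrow. If they match, overflow is not the cause and the 291 KB figure may simply reflect a few very large files pulling the mean up.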
Upvotes: 1