Alucard

Reputation: 17278

How do I calculate the mean of a column

Does anyone know how I can calculate the mean of one of these columns (on Linux)?

sda               2.91    20.44    6.13    2.95   217.53   186.67    44.55     0.84   92.97
sda               0.00     0.00    2.00    0.00    80.00     0.00    40.00     0.22  110.00 
sda               0.00     0.00    2.00    0.00   144.00     0.00    72.00     0.71  100.00 
sda               0.00    64.00    0.00    1.00     0.00     8.00     8.00     2.63   10.00
sda               0.00     1.84    0.31    1.38    22.09   104.29    74.91     3.39 2291.82 
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00  

For example: mean(column 2)

Upvotes: 44

Views: 67987

Answers (5)

Chris Koknat

Reputation: 3451

Perl solution:

perl -lane '$total += $F[1]; END{print $total/$.}' file

-a autosplits each line into the @F array, which is indexed starting at 0, so $F[1] is the second field.
$. is the current line number; in the END block it holds the total number of lines read.

If your fields are separated by commas instead of whitespace:

perl -F, -lane '$total += $F[1]; END{print $total/$.}' file

To print mean values of all columns, assign totals to array @t:

perl -lane 'for $c (0..$#F){$t[$c] += $F[$c]}; END{for $c (0..$#t){print $t[$c]/$.}}' file

Output for the sample data (the first value is 0 because the non-numeric "sda" label counts as 0):

0
0.485
14.38
1.74
0.888333333333333
77.27
49.8266666666667
39.91
1.29833333333333
434.131666666667
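
For comparison, the per-column totals idea could be sketched in Python along these lines. This is only an illustrative sketch, not part of the Perl answer: the function name column_means is made up here, and counting non-numeric fields as 0 is an assumption chosen to roughly mirror what Perl does with the "sda" label.

```python
def column_means(lines):
    """Return the mean of every whitespace-separated column.

    Assumption: non-numeric fields (such as the "sda" label) count as 0,
    roughly mirroring Perl's treatment of such strings in numeric context.
    """
    totals = []
    count = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines (the Perl one-liner would count them)
        count += 1
        for i, field in enumerate(fields):
            if i >= len(totals):
                totals.append(0.0)  # first time this column index is seen
            try:
                totals[i] += float(field)
            except ValueError:
                pass  # non-numeric field contributes 0
    return [t / count for t in totals] if count else []
```

Passing an open file object (for example `column_means(open("file"))`) should work, since iterating over a file yields its lines.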

Upvotes: 4

Tom

Reputation: 41

Simple-r will calculate the mean with the following line:

r -k2 mean file.txt

for the second column. It can also do much more sophisticated analysis, since it runs all of its statistics in the R environment.

Upvotes: 0

OscarRyz

Reputation: 199215

@David Zaslavsky, for the fun of it:

with open("mean.txt") as f:
    # Pair each line with a count of 1, then sum the counts and the values in one pass
    n, t = map(sum, zip(*((1, float(line.split()[1])) for line in f)))
print(t / n)

Upvotes: 0

OscarRyz

Reputation: 199215

You can use Python for that; it is available on Linux.

If the data comes from a file, take a look at this question; just use float instead.

For instance:

# mean.py
def main():
    with open("mean.txt") as f:
        # Parse every whitespace-separated field on each line as a float
        data = [[float(x) for x in line.split()] for line in f]

    # Collect the second column (index 1) from every row
    column_two = [row[1] for row in data]

    print(sum(column_two) / len(column_two))


if __name__ == "__main__":
    main()

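This prints 14.38 (possibly with floating-point rounding noise in the last digits).

I included only the data in mean.txt, not the row label "sda", so row[1] is the second numeric column.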

Upvotes: 1

porges

Reputation: 30580

Awk:

awk '{ total += $2 } END { print total/NR }' yourFile.whatever

Read as:

  • For each line, add column 2 to a variable 'total'.
  • At the end of the file, print 'total' divided by the number of records.
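
The two steps above can also be sketched in Python for anyone who prefers it. This is a rough equivalent under stated assumptions, not a drop-in replacement for the awk command: the function name column_mean and its col parameter are hypothetical, the column is 1-based to match awk's $2, and non-numeric or missing fields are counted as 0.

```python
def column_mean(lines, col=2):
    """Mean of the 1-based column `col`, dividing by the number of records."""
    total = 0.0
    records = 0
    for line in lines:
        records += 1  # like awk's NR, every line counts, even a blank one
        fields = line.split()
        if len(fields) >= col:
            try:
                total += float(fields[col - 1])
            except ValueError:
                pass  # assumption: a non-numeric field contributes 0
    if records == 0:
        raise ValueError("no input lines")
    return total / records
```

For example, `column_mean(open("yourFile.whatever"))` would average the second column of that file.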

Upvotes: 106
