Eric Walters

Reputation: 311

How to calculate median for each row from file in bash shell script

I am trying to calculate the median of each row of a file in my bash shell script. I believe this can be done with a pipeline of cut, sort, head, and tail, but I cannot figure out how to integrate it into my existing code. I want to calculate the median at the same point where I calculate the mean. What is the best way to do this?

while read i
do
    sum=0
    count=0
    mean=0
    median=0
    for num in $i
    do
        sum=$(($sum + $num))
        count=`expr $count + 1`
        mean=`expr $sum / $count`
        #Need to calculate the median
    done
    echo "Sum: $sum Mean: $mean"
done < $2

Upvotes: 3

Views: 2793

Answers (3)

user3408541

Reputation: 63

The only reason to use bash is that Perl is not installed! This script finds the median of each row of numbers:

#!/usr/bin/perl -w

#input file contains rows of random numbers
#find the median of each row of numbers

my @row;
my @sortedRow;
my $rowSize;
my $rowMedian;
my $middleIndex;

while(<>){
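  @row = split(' ');                       #split on whitespace, ignoring leading blanks
  $rowSize = @row;
  @sortedRow = sort { $a <=> $b } @row;    #numeric sort; a plain sort compares as strings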
  $middleIndex = $rowSize / 2;             #this will be a float for odd numbers
  if($rowSize % 2 ==0){                    #even, median is avg middleIndex, middleIndex-1
    $rowMedian = ($sortedRow[$middleIndex] + $sortedRow[$middleIndex-1]) / 2;
  }else {                                  #odd median is middleIndex
    $rowMedian = $sortedRow[$middleIndex]; #perl will implicitly floor this float
  }
  print "Median: $rowMedian \"@row\"\n";
}

Output looks like this:

$ perl median.pl numbers.txt
Median: 2 "1 2 3"
Median: 7 "7 5 9"
Median: 2 "2 2 9"
Median: 5 "5 4 5"
Median: 6 "7 2 6"

Golfed one-liner :)

perl -ne '@r=sort{$a<=>$b}split;print @r%2?$r[@r/2]:($r[@r/2-1]+$r[@r/2])/2,"\n"' numbers.txt

Upvotes: 0

agc

Reputation: 8406

Assuming the rows are tab-separated and of variable length:

  1. Using bash and datamash:

    while read x
    do    tr -s '\t' '\n' <<< "$x" | \
          datamash  median 1
    done < file
    
  2. Using numaverage:

    while read x
    do    tr -s '\t' '\n' <<< "$x" | \
          numaverage -M
    done < file
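
If neither datamash nor numaverage is installed, the sort-based pipeline the question mentions can be sketched in bash itself. This is only a minimal sketch, assuming bash 4+ (for mapfile) and whitespace-separated integers; the function name row_median is made up for illustration, and for an even count it uses integer division, as the question's own mean does.

```shell
#!/usr/bin/env bash
# row_median: hypothetical helper; prints the median of its integer arguments.
row_median() {
    local sorted n mid
    # One number per line, sorted numerically, read back into an array.
    mapfile -t sorted < <(printf '%s\n' "$@" | sort -n)
    n=${#sorted[@]}
    if (( n == 0 )); then
        return 1                      # no numbers, no median
    fi
    mid=$(( n / 2 ))
    if (( n % 2 )); then
        # Odd count: the middle element of the sorted list.
        echo "${sorted[mid]}"
    else
        # Even count: average of the two middle values (integer division).
        echo $(( (${sorted[mid-1]} + ${sorted[mid]}) / 2 ))
    fi
}
```

Inside the question's loop this could be called as median=$(row_median $i), right next to where the mean is computed.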
    

Upvotes: 4

karakfa

Reputation: 67507

awk to the rescue!

awk '{sum=0;
      n=split($0,a);
      for(i=1;i<=n;i++) sum+=a[i];
      asort(a);                                  # asort() requires GNU awk
      median=n%2?a[(n+1)/2]:(a[n/2]+a[n/2+1])/2; # odd n: middle element; even n: mean of the two middle ones
      print sum, sum/n, median}' file

bash is not the right tool for this task.
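
asort() is a GNU awk extension. Where only a POSIX awk is available, the same sum, mean, and median can be sketched by sorting the fields by hand. The insertion sort and the wrapper name row_stats below are illustrative assumptions, not part of the answer above.

```shell
#!/usr/bin/env bash
# row_stats: hypothetical wrapper; prints "sum mean median" for each input row.
# Reads the files given as arguments, or standard input if there are none.
row_stats() {
    awk '{
        n = split($0, a)
        if (n == 0) next                       # skip blank rows
        sum = 0
        for (i = 1; i <= n; i++) sum += a[i]
        # Insertion sort; adding 0 forces a numeric comparison.
        for (i = 2; i <= n; i++) {
            v = a[i]
            for (j = i - 1; j >= 1 && a[j] + 0 > v + 0; j--) a[j + 1] = a[j]
            a[j + 1] = v
        }
        if (n % 2) median = a[(n + 1) / 2]     # odd count: middle element
        else       median = (a[n / 2] + a[n / 2 + 1]) / 2
        print sum, sum / n, median
    }' "$@"
}
```

It can be used as row_stats file, or at the end of a pipeline.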

Upvotes: 2
