daltojam

Reputation: 17

fatal: division by zero attempted when trying to find mean?

I'm trying to find the mean of the numbers in a file that appear on lines containing "<Overall>".

My code:

awk -v file=$file '{if ($1~"<Overall>") {rating+=$1; count++;}} {rating=rating/count; print file, rating;}}' $file | sed 's/<Overall>//'

I'm getting

awk: cmd. line:1: (FILENAME=[file] FNR=1) fatal: division by zero attempted

for every file. I can't see why count would be zero if the file does contain a line such as "<Overall>5"

EDIT: Sample from the (very large) input file, as requested:

<Author>RW53
<Content>Location! Location?       view from room of nearby freeway 
<Date>Dec 26, 2008
<No. Reader>-1
<No. Helpful>-1
<Overall>3
<Value>4
<Rooms>3
<Location>2
<Cleanliness>4
<Check in / front desk>3
<Service>-1
<Business service>-1

Expected output:

[filename] X

Where X is the average of the values on all lines containing <Overall>

Upvotes: 0

Views: 2530

Answers (3)

chepner

Reputation: 530950

You aren't waiting until you've completely read the file to compute the average rating. This is simpler if you use patterns rather than an if statement. You also need to remove <Overall> before you attempt to increment rating.

# sub() returns the number of replacements, so strip the tag first, then add $1
awk '$1 ~ /<Overall>/ {sub(/<Overall>/, "", $1); rating += $1; count++}
     END {rating = rating/(count?count:1); print FILENAME, rating}' "$file"

(Answer has been updated to fix a typo in the call to sub and to correctly avoid dividing by 0.)
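
To see why the division has to wait for END, here is a minimal sketch run against a made-up four-line sample (the temporary file and its contents are assumptions for illustration, not the asker's data). An action with no pattern runs for every record, so in the question's version the division is attempted on the very first line, before any <Overall> line has been counted. Note that sub() is called as a statement here, because it returns the number of replacements rather than the modified text.

```shell
# Hypothetical sample file; the path and contents are made up for illustration.
sample=$(mktemp)
printf '%s\n' '<Author>RW53' '<Overall>3' '<Value>4' '<Overall>2' > "$sample"

# Only lines starting with <Overall> update the running totals.
# The division happens once, in END, after every line has been read.
mean=$(awk '/^<Overall>/ { sub(/^<Overall>/, "", $1); rating += $1; count++ }
            END { if (count) print rating / count; else print "no ratings" }' "$sample")

echo "$mean"
rm -f "$sample"
```

With the two ratings 3 and 2 in the sample, this prints 2.5. Printing a message when count is zero is one possible design choice; returning 0 is another.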

Upvotes: 1

NeronLeVelu

Reputation: 10039

awk -F '>' '
   # the field separator is ">"
   # for lines that contain <Overall>
   /<Overall>/ {
       # add the rating to the sum and increment the counter
       Rate+=$2;Count++}
   # at the end of the file
   END{
      # print the average
      printf( "[%s] %f\n", FILENAME, Rate / ( Count + ( ! Count  ) ) )
      }
   ' "${File}"

# one-liner
awk -F '>' '/<Overall>/{r+=$2;c++}END{printf("[%s] %f\n",FILENAME,r/(c+(!c)))}' "${File}"

Note:

  • ( c + ( ! c ) ) relies on the value of logical NOT (!): !c is 1 if c = 0 and 0 otherwise. So 1 is added when c = 0 and 0 is added otherwise, which ensures the divisor is at least 1.
  • This assumes the whole file has the same format as the sample.
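
The guard can be checked in isolation. This small sketch (the variable names a, b and guard are only for illustration) evaluates c + (!c) for a zero and a non-zero count:

```shell
# Evaluate the divisor guard for c = 0 and for c = 4.
guard=$(awk 'BEGIN { c = 0; a = c + (!c); c = 4; b = c + (!c); print a, b }')
echo "$guard"
```

This prints "1 4": the divisor becomes 1 when nothing was counted and is left unchanged otherwise. One consequence is that a file with no <Overall> lines reports an average of 0 instead of an error.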

Upvotes: 0

Inian

Reputation: 85550

Use Awk as below:

awk -F'<Overall>' 'NF==2 {sum+=$2; count++}
                   END{printf "[%s] %s\n",FILENAME,(count?sum/count:0)}' file

For an input file named input-file that contains two <Overall> lines, like this:

<Author>RW53
<Content>Location! Location?       view from room of nearby freeway
<Date>Dec 26, 2008
<No. Reader>-1
<No. Helpful>-1
<Overall>3
<Value>4
<Rooms>3
<Location>2
<Cleanliness>4
<Check in / front desk>3
<Service>-1
<Business service>-1
<Overall>2

Running it produces,

[input-file] 2.5

The -F'<Overall>' option splits each input line using <Overall> as the delimiter, so only lines containing <Overall> have exactly two fields (NF==2). On those lines the number after the tag is $2, which is added to sum, while count tracks how many such lines were seen.

The END clause is executed after all lines have been read. It prints the filename using the special Awk variable FILENAME, which holds the name of the file being processed, and computes the average only if count is not zero.
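
As a rough check of the NF==2 filter, the sketch below feeds three made-up lines (a non-matching line, an <Overall> line, and an empty line) through the same field separator and collects the field count of each. The variable names fields, out and sep are assumptions for illustration.

```shell
# Report NF for each of three sample lines when splitting on <Overall>.
fields=$(printf '%s\n' '<Author>RW53' '<Overall>3' '' |
  awk -F'<Overall>' '{ out = out sep NF; sep = " " } END { print out }')
echo "$fields"
```

This prints "1 2 0": only the <Overall> line splits into two fields (an empty $1 and the rating in $2), so only that line is selected by NF==2.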

Upvotes: 4
