Sreten Jocić

Reputation: 185

awk count pattern matches and sum the numbers after them

I have a file with the following pattern that repeats many times:

<Content>Un relax en el ritmo trepidante de New York   showReview(14443615, 'full');
<Date>Mar 22, 2008
<Overall rating>3
<No. Reader>-1
<No. Helpful>-1
<Overall>5

So I'm trying to count the number of occurrences of <Overall> without counting <Overall rating>. Then I want to sum the numbers that follow each <Overall> tag. This is my code:

 awk -F'>' '$1=="<Overall" BEGIN{}
    {
        count++
        sum+=$2
    }
    END{printf "%.2f\n", sum/count}' *filename*

Upvotes: 1

Views: 1647

Answers (2)

karakfa

Reputation: 67467

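Another approach: use <Overall> itself as the field separator, so only lines containing that exact tag split into two fields (NF==2).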

$ awk -F'<Overall>' 'NF==2 {sum+=$2; c++} 
                     END   {print (c?sum/c:0)}' file
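As a rough sketch of how this behaves on data shaped like the question's, the snippet below runs the same separator trick and also reports the count and sum alongside the average. The sample lines, the temporary file, and the `result` variable are made up for illustration.

```shell
# Hypothetical sample data modeled on the question's format
sample=$(mktemp)
printf '%s\n' \
  '<Date>Mar 22, 2008' \
  '<Overall rating>3' \
  '<Overall>5' \
  '<Overall rating>4' \
  '<Overall>2' > "$sample"

# Splitting on the literal tag yields NF==2 only for <Overall> lines;
# <Overall rating> lines never contain the separator, so they are skipped.
result=$(awk -F'<Overall>' '
  NF==2 { sum += $2; c++ }
  END   { printf "count=%d sum=%d avg=%.2f\n", c, sum, (c ? sum/c : 0) }
' "$sample")
echo "$result"   # prints: count=2 sum=7 avg=3.50
rm -f "$sample"
```

The `c ? sum/c : 0` guard keeps the script from dividing by zero when no line matches.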

Upvotes: 1

l'L'l

Reputation: 47169

If I understand correctly, you would like to sum the values on all lines that contain <Overall>:

awk 'BEGIN{FS=">";sum=0} $0~/<Overall>/ {sum+=$2} END{print sum}' file

For example, a file containing the following lines would sum to 175:

...
<Overall>25
<Overall>75
...
<Overall>50
...
<Overall>25
...
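The script in the question appears to fail because the pattern `$1=="<Overall"` sits in front of `BEGIN{}` instead of in front of the action block. A minimal sketch of that same approach with the pattern moved onto the action is shown below. It also prints the count, which the question asked for. The sample data, temporary file, and `totals` variable are assumptions for illustration only.

```shell
# Hypothetical sample data: the four <Overall> values from the example above,
# mixed with <Overall rating> lines that must not be counted.
sample=$(mktemp)
printf '%s\n' \
  '<Overall rating>3' '<Overall>25' \
  '<Overall rating>4' '<Overall>75' \
  '<Overall>50' '<Overall>25' > "$sample"

# With FS=">", $1 is exactly "<Overall" only for the bare tag;
# for "<Overall rating>3" it is "<Overall rating", so that line is skipped.
totals=$(awk -F'>' '
  $1 == "<Overall" { count++; sum += $2 }
  END { if (count) printf "%d %d %.2f\n", count, sum, sum/count }
' "$sample")
echo "$totals"   # prints: 4 175 43.75 (count, sum, average)
rm -f "$sample"
```

Comparing `$1` for equality rather than matching `/<Overall/` is what keeps `<Overall rating>` lines out of the totals.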

Upvotes: 0
