Reputation: 69

awk compare consecutive rows

This is probably quite simple, but I am stuck. Thanks for any help. I have an input file with two columns: the first holds an ID and the second a value associated with it. I need an output where the first column is the ID (no repetitions allowed) and the second column is the average of the values for that ID. IDs are not always repeated; when they are, the repeats are always consecutive and an ID appears at most twice.

Input

10;10
10;20
20;30
20;40
30;15
40;10
40;12

Desired output

10;15
20;35
30;15
40;11

Upvotes: 0

Answers (3)

cadrian

Reputation: 7376

(written live, I did not try it; assume GNU awk; assume sorted input)

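awk -F';' '
    BEGIN {
        id = ""
    }
    $1 != id {
        # New ID: print the average of the group that just ended, then reset
        if (n > 0) printf("%s;%s\n", id, sum/n);
        n = sum = 0;
        id = $1;
    }
    {
        # Runs for every row, so repeated IDs are accumulated too
        sum += $2;
        n++;
    }
    END {
        # Flush the last group
        if (n > 0) printf("%s;%s\n", id, sum/n);
    }
' file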

Upvotes: 0

Ed Morton

Reputation: 203532

$ cat tst.awk
BEGIN { FS=OFS=";" }
($1 != prev) && (NR>1) { print prev, sum/cnt; sum=cnt=0 }
{ prev=$1; sum+=$2; cnt++ }
END { if (cnt) print prev, sum/cnt }

$ awk -f tst.awk file
10;15
20;35
30;15
40;11

Upvotes: 3

Kent

Reputation: 195059

This one-liner does it:

awk -F';' -v OFS=";" '{a[$1]+=$2+0;b[$1]++}END{for(x in a)print x,a[x]/b[x]}' file

Test with your data:

kent$  cat f
10;10
10;20
20;30
20;40
30;15
40;10
40;12

kent$  awk -F';' -v OFS=";" '{a[$1]+=$2+0;b[$1]++}END{for(x in a)print x,a[x]/b[x]}' f
10;15
20;35
30;15
40;11
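
One caveat with the array approach: awk does not define the order in which `for (x in a)` visits the keys, so the output lines are not guaranteed to come out in input order on every awk. Below is a minimal sketch of an order-preserving variant, assuming you want IDs printed in the order they first appear. The function name `avg_by_id` and the arrays `order`, `sum` and `cnt` are my own names for illustration, not something from the answers here.

```shell
# avg_by_id: hypothetical helper; reads "ID;value" rows from files or stdin
avg_by_id() {
    awk -F';' -v OFS=';' '
        !($1 in cnt) { order[++n] = $1 }   # remember each ID the first time it is seen
        { sum[$1] += $2; cnt[$1]++ }       # accumulate total and row count per ID
        END {
            for (i = 1; i <= n; i++)       # walk IDs in first-seen order
                print order[i], sum[order[i]] / cnt[order[i]]
        }
    ' "$@"
}

printf '10;10\n10;20\n20;30\n20;40\n30;15\n40;10\n40;12\n' | avg_by_id
# prints:
# 10;15
# 20;35
# 30;15
# 40;11
```

Unlike the consecutive-row comparison, this does not rely on repeated IDs being adjacent, at the cost of holding every ID in memory until the end.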

Upvotes: 4
