Reputation: 45
Say I have the following 3 columns in text file:
1 003 3
2 006 1
3 005 4
4 001 2
5 006 7
6 002 2
7 004 3
8 001 6
9 002 8
10 005 2
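I want to output 3 columns for each block of 3 lines: column 1 of the block's last line, the column 2 value from the line holding the block's largest column 3 value, and that largest column 3 value.
The blocks of 3 start after the first line, which is treated as a block on its own. So from that input, the output would be: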
1 003 3
4 005 4
7 006 7
10 002 8
What I tried:
awk \
'BEGIN{
    cnt=3;
    max=0;
};
{
    if (cnt == 3){
        cnt++;
        max_arr[cnt]=$3;
        for (i in max_arr){
            if (max_arr[i] > max)
                { max = max_arr[i] }
        }
        printf "%s %s %s\n", $1,$2,max;
        cnt=1;
        delete max_arr;
        max=0;
    }
    else{
        cnt++;
        max_arr[cnt]=$3;
    }
}' input_file.txt
This gives me:
1 003 3
4 001 4
7 004 7
10 005 8
Columns 1 and 3 are correct, but column 2 is wrong.
Upvotes: 1
Views: 837
Reputation: 203169
This is how you do it robustly:
$ cat tst.awk
{
    isBlockBeg = ( (NR%3)==2 )
    isBlockEnd = ( (NR%3)==1 )
}
isBlockBeg { max=$3 }
$3 >= max { max=$3; val=$2 }
isBlockEnd { print $1, val, max }
END { if (!isBlockEnd) print $1, val, max }
$ awk -f tst.awk file
1 003 3
4 005 4
7 006 7
10 002 8
Note that the above will work whether your data is numbers or strings, whether or not your data is all-negative, and even if your data doesn't end nicely at the end of a block of 3. If you don't need that last part, you can reduce it to just:
$ cat tst.awk
(NR%3)==2 { max=$3 }
$3 >= max { max=$3; val=$2 }
(NR%3)==1 { print $1, val, max }
$ awk -f tst.awk file
1 003 3
4 005 4
7 006 7
10 002 8
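To illustrate the "doesn't end nicely" case, here is a small sketch that feeds the longer script only the first 9 sample lines, so the last block is one line short. The function name block_max is made up for this example, and it assumes an awk that still exposes the last record's fields in END (gawk does):

```shell
# Hypothetical wrapper around the longer script from this answer.
block_max() {
    awk '
        {
            isBlockBeg = ( (NR%3)==2 )
            isBlockEnd = ( (NR%3)==1 )
        }
        isBlockBeg { max=$3 }
        $3 >= max  { max=$3; val=$2 }
        isBlockEnd { print $1, val, max }
        # Flush a trailing block that has fewer than 3 lines.
        END { if (!isBlockEnd) print $1, val, max }
    '
}

# Sample input with line 10 dropped, so the final block has only 2 lines.
printf '%s\n' '1 003 3' '2 006 1' '3 005 4' '4 001 2' '5 006 7' \
              '6 002 2' '7 004 3' '8 001 6' '9 002 8' | block_max
# prints:
# 1 003 3
# 4 005 4
# 7 006 7
# 9 002 8
```

The shorter script has no END rule, so with the same 9 lines it would stop after "7 006 7".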
Upvotes: 2
Reputation: 5062
You could try the following awk script:
# file: script.awk
# if max[1] is uninitialized OR
# if the 3rd field of the current line is greater than the one stored in our max array,
# we store the 2nd and 3rd fields of the line in the array
!(1 in max) || max[1]<$3 { max[0]=$2; max[1]=$3; }
# if the remainder of line_number / 3 is 1 (lines 1, 4, 7, 10, ...)
NR % 3 == 1 {
    # we print the line number and the 2 stored values
    print NR,max[0],max[1]
    # we delete the old array
    delete max
}
You can then call it like this: awk -f script.awk data
Sample input:
> cat data
1 003 3
2 006 1
3 005 4
4 001 2
5 006 7
6 002 2
7 004 3
8 001 6
9 002 8
10 005 2
Sample output:
> awk -f script.awk data
1 003 3
4 005 4
7 006 7
10 002 8
Upvotes: 2
Reputation: 67467
If the $3 values are all positive:
$ awk '$3>m3 {m3=$3; v2=$2}
NR%3==1 {print $1,v2,m3; m3=0}' file
1 003 3
4 005 4
7 006 7
10 002 8
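If $3 can be zero or negative, resetting m3 to 0 no longer works, because no value in the block ever exceeds it. One possible variant (a sketch, not part of the one-liner above; the function name block_max_any is made up) takes the first line of each block unconditionally instead of resetting to 0:

```shell
# Hypothetical variant: seed the maximum from the first line of every block
# (line 1, then lines 2, 5, 8, ...), so the sign of $3 does not matter.
block_max_any() {
    awk 'NR==1 || NR%3==2 || $3>m3 {m3=$3; v2=$2}
         NR%3==1 {print $1,v2,m3}'
}

# All-negative data as a check.
printf '%s\n' '1 003 -3' '2 006 -1' '3 005 -4' '4 001 -2' | block_max_any
# prints:
# 1 003 -3
# 4 006 -1
```

On the sample input from the question it gives the same four lines as the one-liner above.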
Upvotes: 1
Reputation: 13239
A shorter awk script could be this one:
awk 'm<$3{m=$3;n=$2} !((NR+2)%3){print $1,n,m;m=n=""}' file
where m is the max value of column 3 and n is the corresponding value of column 2.
The condition !((NR+2)%3) is true for the first line and for every third line after it. Its action prints the wanted values and resets both m and n.
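As a quick check of which line numbers satisfy !((NR+2)%3), this sketch evaluates the same expression for 1 to 10 in a BEGIN block. The names matching_lines and nr are made up for the example, and nr stands in for NR:

```shell
# Hypothetical helper: list the values 1..10 for which !((nr+2)%3) is true.
matching_lines() {
    awk 'BEGIN {
        for (nr = 1; nr <= 10; nr++)
            if (!((nr + 2) % 3))
                out = out (out == "" ? "" : " ") nr
        print out
    }'
}

matching_lines
# prints: 1 4 7 10
```

So the condition selects the same lines as NR%3==1.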
Upvotes: 2