Reputation: 724
Using columns 1 and 2 as the key, I want to delete all rows in which the value increments by one.
Input:
1000 1001 140
1000 1002 140
1000 1003 140
1000 1004 140
1000 1005 140
1000 1006 140
1000 1201 140
1000 1202 140
1000 1203 140
1000 1204 140
1000 1205 140
2000 1002 140
2000 1003 140
2000 1004 140
2000 1005 140
2000 1006 140
Desired output:
1000 1001 140
1000 1006 140
1000 1201 140
1000 1205 140
2000 1002 140
2000 1006 140
I have tried:
awk '{if (a[$1] < $2)a[$1]=$2;}END{for(i in a){print i,a[i];}}' <file>
But for some reason, it keeps only the maximum value.
Upvotes: 0
Views: 86
Reputation: 7837
Your problem statement doesn't describe your output. You want to print the first and last row of each contiguous range. Like this:
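$ awk '# a new range starts when column 1 grows or column 2 jumps by more than 1
$1 > A || $2 > B + 1 {
    if (row) { print row }     # last row of the previous range
    print                      # first row of the new range
}
{ A = $1; B = $2; row = $0 }   # remember this row for the next comparison
END { print }' dat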
1000 1001 140
1000 1006 140
1000 1201 140
1000 1205 140
2000 1002 140
2000 1006 140
The basic problem is to determine whether column 2 of a line is exactly 1 more than on the prior line. The only way to do that is to have both lines to compare, so store the key values of each line as it's read and compare the current line against them. Note that this relies on the input being sorted by columns 1 and 2, as in your sample.
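One caveat: if a range consists of a single row, the command above prints that row twice (once as the start of its range and again as the "last row" when the next range begins). Here is a minimal sketch of a variant that counts the rows in each range to avoid that. The function name range_ends and the variable names key, last, prev and n are my own, not anything from your script, and it still assumes the input is sorted as in your sample.

```shell
# Hypothetical helper: print the first and last row of each range in which
# column 1 stays the same and column 2 increases by exactly 1.
# Reads the files given as arguments, or standard input if there are none.
range_ends() {
  awk '
    # A range breaks on the first record, when column 1 changes,
    # or when column 2 is not exactly the previous value plus 1.
    NR == 1 || $1 != key || $2 != last + 1 {
      if (n > 1) print prev   # close the previous range unless it had a single row
      print                   # first row of the new range
      n = 0
    }
    # Remember the current row and count it as part of the current range.
    { key = $1; last = $2; prev = $0; n++ }
    # Close the final range, again skipping single-row ranges.
    END { if (n > 1) print prev }
  ' "$@"
}
```

Call it as range_ends dat. On the input from the question it prints the same six rows as the command above; the difference only shows up when a range has exactly one row.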
Upvotes: 1