Reputation: 443
I have a file listing 3 types of sequences and their positions. The sequences recur in separate runs, like this:
seq1 2
seq1 5
seq1 10
seq3 15
seq3 34
seq3 60
seq2 100
seq2 110
seq2 200
seq3 210
seq3 250
seq3 300
seq1 310
seq1 330
seq1 400
The second value is a position. It is always unique, and the file is sorted by it, which is why the sequences are scattered.
Each time a run of the same sequence starts, I want to grab the minimum and maximum position of that run. The output should be (seq min max):
seq1 2 10
seq3 15 60
seq2 100 200
seq3 210 300
seq1 310 400
Is it possible to do this in bash with awk or anything else?
Upvotes: 1
Views: 72
Reputation: 8711
Another awk
$ awk ' { if(NR>1 && p!=$1) { print p,min,max; min="" } if(min=="") min=$2; max=$2; p=$1 }
END { print p,min,max } ' adrian.txt
seq1 2 10
seq3 15 60
seq2 100 200
seq3 210 300
seq1 310 400
$
Upvotes: 2
Reputation: 784948
You may use this awk:
awk 'p != $1 {if (NR>1) print p, first, last; first=$2} {p=$1; last=$2}
END{print p, first, last}' file
seq1 2 10
seq3 15 60
seq2 100 200
seq3 210 300
seq1 310 400
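Both one-liners above take the first and last position of each run, which equals min and max only because the file is sorted by position. If that assumption might not hold, a minimal sketch along the same lines can compare values explicitly. The function name `summarize_runs` and the variable names are hypothetical, chosen for illustration:

```shell
# Hypothetical helper: print "name min max" for each consecutive run of the same name.
# It compares values instead of taking first/last, so positions inside a run
# do not have to be sorted.
summarize_runs() {
  awk '
    $1 != name {                          # a new run starts (or this is the first line)
      if (NR > 1) print name, min, max    # flush the previous run
      name = $1; min = max = $2
      next
    }
    {
      if ($2 + 0 < min + 0) min = $2      # "+ 0" forces a numeric comparison
      if ($2 + 0 > max + 0) max = $2
    }
    END { if (NR > 0) print name, min, max }   # flush the final run; print nothing on empty input
  ' "$@"
}
```

It can be called as `summarize_runs file`, or fed from a pipe. On the sample data it should produce the same five lines as the commands above.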
Upvotes: 3