Reputation: 45
I would like to sort the file below on characters 7 through 9 of the 2nd column.
$ cat sample.bed
chr1 248956422 chr1:248956422
chr2 242193529 chr2:242193529
chr3 198295559 chr3:198295559
chr4 190214555 chr4:190214555
chr5 181538259 chr5:181538259
chr6 170805979 chr6:170805979
chr7 159345973 chr7:159345973
chrX 156040895 chrX:156040895
chr8 145138636 chr8:145138636
chr9 138394717 chr9:138394717
I run sort as shown and get the output below:
$ sort -n -k2.7,2.9 sample.bed
chr4 190214555 chr4:190214555
chr6 170805979 chr6:170805979
chr5 181538259 chr5:181538259
chr2 242193529 chr2:242193529
chr8 145138636 chr8:145138636
chrX 156040895 chrX:156040895
chr3 198295559 chr3:198295559
chr9 138394717 chr9:138394717
chr1 248956422 chr1:248956422
chr7 159345973 chr7:159345973
sort changes the row order, but not according to the key I specified. Note that sort -k2,2 works as expected:
$ sort -k2,2 sample.bed
chr9 138394717 chr9:138394717
chr8 145138636 chr8:145138636
chrX 156040895 chrX:156040895
chr7 159345973 chr7:159345973
chr6 170805979 chr6:170805979
chr5 181538259 chr5:181538259
chr4 190214555 chr4:190214555
chr3 198295559 chr3:198295559
chr2 242193529 chr2:242193529
chr1 248956422 chr1:248956422
I must be missing something obvious... Any help would be greatly appreciated.
Upvotes: 2
Views: 4070
Reputation: 140960
The output of sort --debug is very informative:
$ sort -n -k2.7,2.9 --debug sample.bed
...
chr4 190214555 chr4:190214555
___
______________________________________
...
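It compares 021 from the chr4 line (the first line of the output), because sort counts the blanks preceding a field as part of that field. The 021 implies four blanks in front of the number, so characters 7 to 9 of the field are really digits 3 to 5 of 190214555. You can either shift the positions to account for the blanks: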
sort -n -k2.11,2.13
or ignore leading blanks with -b:
sort -b -n -k2.7,2.9
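As a quick check, here is a small sketch comparing the three invocations on four of the rows. It assumes the columns are separated by four blanks (which is what the 2.11 offset implies). The temporary file and the variable names are only for illustration.

```shell
# Four rows from sample.bed, assuming four blanks between columns.
tmp=$(mktemp)
printf '%s\n' \
  'chr1    248956422    chr1:248956422' \
  'chr4    190214555    chr4:190214555' \
  'chr5    181538259    chr5:181538259' \
  'chr6    170805979    chr6:170805979' > "$tmp"

# Print only the first column of the sorted result, joined with commas.
first_col() { cut -c1-4 | paste -sd, -; }

# Leading blanks count as part of field 2, so 2.7-2.9 hits digits 3-5.
no_b=$(LC_ALL=C sort -n -k2.7,2.9 "$tmp" | first_col)

# Fix 1: shift the offsets by the four blanks.
shifted=$(LC_ALL=C sort -n -k2.11,2.13 "$tmp" | first_col)

# Fix 2: -b makes sort count from the first non-blank character.
with_b=$(LC_ALL=C sort -b -n -k2.7,2.9 "$tmp" | first_col)

echo "without -b: $no_b"      # chr4,chr6,chr5,chr1  (keys 021, 080, 153, 895)
echo "2.11,2.13:  $shifted"   # chr5,chr1,chr4,chr6  (keys 259, 422, 555, 979)
echo "with -b:    $with_b"    # chr5,chr1,chr4,chr6
rm -f "$tmp"
```

The -b variant is the more robust of the two, since it keeps working if the number of blanks between columns varies from line to line.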
Upvotes: 5