Reputation: 73
If I create a text file containing the following lines:
>TESTTEXT_10000000
>TESTTEXT_1000000
>TESTTEXT_10000002
>TESTTEXT_10000001
and run sort myfile, my output is
>TESTTEXT_1000000
>TESTTEXT_10000000
>TESTTEXT_10000001
>TESTTEXT_10000002
However, if I append /1 and /2 to my lines the sort output changes drastically, and I do not know why.
Input:
>TESTTEXT_10000000/1
>TESTTEXT_1000000/1
>TESTTEXT_10000002/1
>TESTTEXT_10000001/1
Output:
>TESTTEXT_10000000/1
>TESTTEXT_1000000/1
>TESTTEXT_10000001/1
>TESTTEXT_10000002/1
Input:
>TESTTEXT_10000000/2
>TESTTEXT_1000000/2
>TESTTEXT_10000002/2
>TESTTEXT_10000001/2
Output:
>TESTTEXT_10000000/2
>TESTTEXT_10000001/2
>TESTTEXT_1000000/2
>TESTTEXT_10000002/2
Is the forward slash being recognised as a separator? Using --field-separator did not alter the behaviour. If so, why is 1000000/2 in between the 10000001/2 and 10000002/2 entries? Using human sort, numeric sort or other options never brought about consistency. Can anyone help me out here?
:edit:
Because it seems to be relevant, considering the answers, the value of LC_ALL on this machine is en_GB.UTF-8
Upvotes: 5
Views: 222
Reputation: 47099
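/ sorts before 0 in plain byte order, but that is not what en_GB.UTF-8 uses. Your output is consistent with the locale's collation rules ignoring punctuation such as / on the first pass, so 1000000/2 is compared roughly as 10000002 and lands between 10000001/2 (100000012) and 10000002/2 (100000022). Setting LC_ALL=C for the sort command forces byte-wise comparison and should give the order you expected.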
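One way to test how much the locale matters is to rerun the sort with byte-wise comparison and compare the result with the locale-aware output. A minimal sketch, assuming a POSIX-style shell with mktemp available; the scratch directory and the variable name c_sorted are only for illustration:

```shell
# Recreate the /2 sample from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' \
  '>TESTTEXT_10000000/2' \
  '>TESTTEXT_1000000/2' \
  '>TESTTEXT_10000002/2' \
  '>TESTTEXT_10000001/2' > "$tmpdir/myfile"

# LC_ALL=C makes sort compare raw bytes: '/' is 0x2F, '0' is 0x30.
c_sorted=$(LC_ALL=C sort "$tmpdir/myfile")
printf '%s\n' "$c_sorted"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

Running the same file through plain sort under en_GB.UTF-8 and diffing the two outputs shows whether the collation rules are what reorder the lines.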
For your use case you could probably use version sort (-V):
sort -V myfile
Alternatively, you can specify the separator and the key to sort on:
sort -t/ -k1,1 myfile
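If lines with /1 and /2 end up in the same file, the key-based approach can be extended with a second key for the part after the slash. A sketch under a few assumptions: the extra -k2,2n key, the LC_ALL=C prefix and the variable name keyed are additions for illustration and not part of the command above.

```shell
# Sort on the text before '/' first, then numerically on the suffix after it.
keyed=$(printf '%s\n' \
  '>TESTTEXT_10000000/2' \
  '>TESTTEXT_1000000/1' \
  '>TESTTEXT_10000000/1' \
  '>TESTTEXT_1000000/2' |
  LC_ALL=C sort -t/ -k1,1 -k2,2n)
printf '%s\n' "$keyed"
```

Restricting the first key to field 1 keeps the slash and the suffix out of that comparison, so the suffix only acts as a tie-breaker.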
Upvotes: 3