Roy

Reputation: 945

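Why does "sort" behave weirdly when "LANG" is not set?

In my bash session, "LANG" was set to "en_US.UTF-8", and "sort" worked as I expected. But if I unset "LANG", "sort" behaves weirdly, and running it with "LC_ALL=C" or "LC_ALL=POSIX" gives the same weird result. Does anyone know why "sort" stops working when "LANG" is not set? In each run below, the first block is the input I typed and the second block is the output.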

$ echo $LANG
en_US.UTF-8

$ sort -t$'\t' -k1,3 -gr
-4.445905   1   0.965933
-4.445905   1   0.76126
-4.445905   1   0.74816
-4.445905   1   0.633251
-4.445905   1   0.596921
-4.445905   1   0.464271
-4.445905   1   0.406553
-4.445905   1   0.350968
-4.445905   1   0.307701
-4.445905   1   0.188411
-4.445905   1   0.0377354
-4.445905   1   0.0221152
-4.445905   -1  0.999663
-4.445905   -1  0.987287
-4.445905   -1  0.97882
-4.445905   -1  0.969835
-4.445905   -1  0.96705
-4.445905   -1  0.964982
-4.445905   -1  0.920791
-4.445905   -1  0.901001
-4.445905   -1  0.877351
-4.445905   -1  0.87702

-4.445905   -1  0.999663
-4.445905   -1  0.987287
-4.445905   -1  0.97882
-4.445905   -1  0.969835
-4.445905   -1  0.96705
-4.445905   1   0.965933
-4.445905   -1  0.964982
-4.445905   -1  0.920791
-4.445905   -1  0.901001
-4.445905   -1  0.877351
-4.445905   -1  0.87702
-4.445905   1   0.76126
-4.445905   1   0.74816
-4.445905   1   0.633251
-4.445905   1   0.596921
-4.445905   1   0.464271
-4.445905   1   0.406553
-4.445905   1   0.350968
-4.445905   1   0.307701
-4.445905   1   0.188411
-4.445905   1   0.0377354
-4.445905   1   0.0221152

$ unset LANG

$ sort -t$'\t' -k1,3 -gr
-4.445905   1   0.965933
-4.445905   1   0.76126
-4.445905   1   0.74816
-4.445905   1   0.633251
-4.445905   1   0.596921
-4.445905   1   0.464271
-4.445905   1   0.406553
-4.445905   1   0.350968
-4.445905   1   0.307701
-4.445905   1   0.188411
-4.445905   1   0.0377354
-4.445905   1   0.0221152
-4.445905   -1  0.999663
-4.445905   -1  0.987287
-4.445905   -1  0.97882
-4.445905   -1  0.969835
-4.445905   -1  0.96705
-4.445905   -1  0.964982
-4.445905   -1  0.920791
-4.445905   -1  0.901001
-4.445905   -1  0.877351
-4.445905   -1  0.87702

-4.445905   1   0.965933
-4.445905   1   0.76126
-4.445905   1   0.74816
-4.445905   1   0.633251
-4.445905   1   0.596921
-4.445905   1   0.464271
-4.445905   1   0.406553
-4.445905   1   0.350968
-4.445905   1   0.307701
-4.445905   1   0.188411
-4.445905   1   0.0377354
-4.445905   1   0.0221152
-4.445905   -1  0.999663
-4.445905   -1  0.987287
-4.445905   -1  0.97882
-4.445905   -1  0.969835
-4.445905   -1  0.96705
-4.445905   -1  0.964982
-4.445905   -1  0.920791
-4.445905   -1  0.901001
-4.445905   -1  0.877351
-4.445905   -1  0.87702

$ LC_ALL=POSIX sort -t$'\t' -k1,3 -gr
-4.445905   1   0.965933
-4.445905   1   0.76126
-4.445905   1   0.74816
-4.445905   1   0.633251
-4.445905   1   0.596921
-4.445905   1   0.464271
-4.445905   1   0.406553
-4.445905   1   0.350968
-4.445905   1   0.307701
-4.445905   1   0.188411
-4.445905   1   0.0377354
-4.445905   1   0.0221152
-4.445905   -1  0.999663
-4.445905   -1  0.987287
-4.445905   -1  0.97882
-4.445905   -1  0.969835
-4.445905   -1  0.96705
-4.445905   -1  0.964982
-4.445905   -1  0.920791
-4.445905   -1  0.901001
-4.445905   -1  0.877351
-4.445905   -1  0.87702

-4.445905   1   0.965933
-4.445905   1   0.76126
-4.445905   1   0.74816
-4.445905   1   0.633251
-4.445905   1   0.596921
-4.445905   1   0.464271
-4.445905   1   0.406553
-4.445905   1   0.350968
-4.445905   1   0.307701
-4.445905   1   0.188411
-4.445905   1   0.0377354
-4.445905   1   0.0221152
-4.445905   -1  0.999663
-4.445905   -1  0.987287
-4.445905   -1  0.97882
-4.445905   -1  0.969835
-4.445905   -1  0.96705
-4.445905   -1  0.964982
-4.445905   -1  0.920791
-4.445905   -1  0.901001
-4.445905   -1  0.877351
-4.445905   -1  0.87702

Upvotes: 0

Views: 90

Answers (1)

Roy

Reputation: 945

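Apparently I didn't correctly understand the sort key option "-k". It is actually "from POS1 to POS2, inclusive", so "-k 1,3" defines a single key that runs from the start of column 1 to the end of column 3. With "-g", that key is read as a number from its beginning, so every line here has the same key, -4.445905. As far as I can tell, "sort" then breaks the ties by comparing whole lines under the current locale's collation rules (reversed by "-r"), which is why the result depends on "LANG". Thus the later two "sort" runs are doing the correct thing for plain byte order, while the first one only looks weird because "en_US.UTF-8" collation is applied to the tie-break.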
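
Since "-k1,3" makes one key out of columns 1 through 3, sorting column by column needs a separate "-k" per field. Below is a minimal sketch of that, not a definitive fix: the three rows are a hypothetical sample copied from the data above, "TAB", "data" and "sorted" are names made up for the illustration, and "LC_ALL=C" is only there to make the run independent of the caller's locale.

```shell
# Hypothetical sample: three tab-separated rows taken from the question's data.
TAB=$(printf '\t')   # portable stand-in for bash's $'\t'
data=$(printf '%s\t%s\t%s\n' \
    -4.445905 1 0.965933 \
    -4.445905 -1 0.999663 \
    -4.445905 1 0.76126)

# One key per field: -kN,N starts and ends at field N, so "g" parses exactly
# that field as a number and "r" reverses the order for that key only.
sorted=$(printf '%s\n' "$data" | LC_ALL=C sort -t"$TAB" -k1,1gr -k2,2gr -k3,3gr)
printf '%s\n' "$sorted"
```

With these keys, column 1 is compared first, then column 2, then column 3, each numerically and in descending order, so the tie-break on the whole line no longer decides the result.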

Upvotes: 1
